
Three Dimensional Corners: 
A Box Norm Proof 



Michael T. Lacey    William McClain




April 18, 2008 



Abstract 



For any discrete additive abelian group (G, +), we define a d-dimensional corner to be the d + 1 points in $G^d$ given by

$g,\ g + h e_r, \quad 1 \le r \le d, \quad h \in G - \{0\}$,

where $e_r = (0, \dots, 1, \dots, 0)$, $1 \le r \le d$, is the d-dimensional vector with a 1 in coordinate r. The Ramsey numbers of interest are $R(G, d)$, the maximum cardinality of a subset $A \subset G^d$ which does not contain a d-dimensional corner.

We give a new proof of a special case of the Theorem of Furstenberg and Katznelson, in that in dimension d = 3, for the group G a finite field of characteristic 5,

$R(\mathbb{F}_5^n, 3) = o(|\mathbb{F}_5^n|^3), \quad n \to \infty$.

Our proof, specialized to one dimension, would reduce to Gowers' proof [I'l of four term arithmetic progressions in dense subsets of the integers. (Also see [7].) Nevertheless, there are significant difficulties to overcome, and as a result this proof does not yield new quantitative bounds.
1 Introduction

For any discrete abelian group (G, +), we define a d-dimensional corner to be the d + 1 points in $G^d$ given by

$g,\ g + h(1,0,0,\dots,0),\ g + h(0,1,0,\dots,0),\ \dots,\ g + h(0,0,0,\dots,1), \quad h \in G - \{0\}$.

The Ramsey numbers of interest are $R(G, d)$, the maximum cardinality of a subset $A \subset G^d$ which does not contain a d-dimensional corner.
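
As a concrete illustration of this definition, the following Python sketch tests whether a subset of $\mathbb{Z}_N^d$ contains a d-dimensional corner, and computes $R(\mathbb{Z}_N, d)$ by exhaustive search. The names `has_corner` and `ramsey_number` are illustrative choices, not notation from the paper, and the search is feasible only for very small N and d.

```python
from itertools import combinations, product

def has_corner(A, N, d):
    """Return True if A (a collection of d-tuples over Z_N) contains a corner."""
    A = set(A)
    for g in A:
        for h in range(1, N):  # h ranges over Z_N - {0}
            # The corner consists of g and the d points g + h * e_r.
            if all(
                tuple((g[i] + (h if i == r else 0)) % N for i in range(d)) in A
                for r in range(d)
            ):
                return True
    return False

def ramsey_number(N, d):
    """Brute-force R(Z_N, d): the largest size of a corner-free subset of Z_N^d."""
    points = list(product(range(N), repeat=d))
    for size in range(len(points), 0, -1):
        for subset in combinations(points, size):
            if not has_corner(subset, N, d):
                return size
    return 0
```

For instance, `has_corner({(0, 0), (1, 0), (0, 1)}, 3, 2)` holds with g = (0, 0) and h = 1, while the diagonal `{(0, 0), (1, 1), (2, 2)}` is corner-free in $\mathbb{Z}_3^2$.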

The principal result in the subject is the Theorem of Furstenberg and Katznelson [3], a generalization of the Szemerédi Theorem [22] to arbitrary dimension.

1.1 Furstenberg-Katznelson Theorem. We have the estimate below, for any dimension d.

$R(\mathbb{Z}_N, d) = o(N^d), \quad N \to \infty$.

The principal result of this paper is a new proof of this Theorem, in dimension d = 3, for a finite field.

1.2 Main Theorem. We have this estimate, where $N = 5^n = |\mathbb{F}_5^n|$,

$R(\mathbb{F}_5^n, 3) = o(N^3), \quad n \to \infty$.

The quantitative bound we provide is of Ackermann type, and accordingly we do not attempt to specify it. In the two dimensional case, there is a much better quantitative bound, doubly logarithmic in nature, due to Shkredov [18, 19].



1.3 Shkredov's Two Dimensional Theorem. There is a 0 < c < 1 for which we have the estimate below in the two dimensional case.

$R(\mathbb{Z}_N, 2) \lesssim \dfrac{N^2}{(\log\log N)^{c}}$.

In the simpler case of the finite field, one can get a better estimate, in that the constant c can be specified. See [15], also HI. Indeed, it would appear that any improvement in the constant below would require new ideas.

1.4 Theorem. In the finite field setting, we have the estimate below in the two dimensional case. Set $N = p^n$ for prime p.

$R(\mathbb{F}_p^n, 2) \lesssim N^2\, \dfrac{\log\log\log N}{\log\log N}, \quad N \to \infty$.



Our methods of proof are those of arithmetic combinatorics, which in most instances give better quantitative bounds. However, in this proof, our bounds are of Ackermann type. It took some time for a purely combinatorial proof of the Furstenberg-Katznelson Theorem to be found; see [5, 6, 16] and the commentary in [20]. Thus, our proof using the Gowers norms 1120 L and the double recursion argument of Shkredov IITSB might have some independent interest.

The Theorem we discuss is the first 'hard' case, as it corresponds to four-term arithmetic progressions IJHIZll. The 'hardness' is expressed in terms of the very weak information that we get from the Box Norm, an issue we go into in more depth in the next section, see also §111. The rigorous results on the Box Norm are Lemma 8.2 below, and a more sophisticated variant, Lemma 8.3.

A central question in the subject of Ergodic Theory concerns the identification of the characteristic factors for multi-linear ergodic averages, especially in the sense of Host and Kra [12]-[14]. In the case of commuting transformations, the only complete information about these factors is in the case of two commuting transformations, a result of Conze and Lesigne |!3, also [14]. Incorporating their results into a proof of Shkredov's Theorem is of substantial interest. Our ignorance of these factors is also a hindrance in the result of Bergelson, Leibman and Lesigne [1]. Perhaps this approach can shed some light on this question.

There should be no essential difficulty in rewriting this proof to treat the estimate $R(\mathbb{Z}_N, 3) = o(N^3)$. We have adopted the finite field setting just as a matter of convenience, making the arguments of §9 technically a little easier (though admittedly there is little gain in simplicity by this choice). It appears to be an interesting question, requiring additional insight, to extend this argument to higher dimensions.

Acknowledgment. The first author completed part of this work while in residence at the Fields Institute, Toronto, Canada, as a George Eliot Distinguished Visitor. The support and hospitality of that Institute are gratefully acknowledged. The second author has been supported by an NSF VIGRE grant at the Georgia Institute of Technology.



2 Overview of the Proof 

There is a substantial jump in difficulty of the proof in passing from the two dimensional case to the three dimensional case. The three dimensional case, projected back to one dimension, gives a result about four term arithmetic progressions, explaining part of this difficulty. Accordingly, we begin with a description of the two dimensional case.

In two dimensions, there are three important coordinate directions: $e_1 = (1, 0)$, $e_2 = (0, 1)$, and $e_3 = e_1 + e_2$, associated with the endpoints of the corners.

We exploit these three choices of coordinate directions by this mechanism. Consider three functions $\lambda_j : \mathbb{Z}_N^3 \to \mathbb{Z}_N^2$ given by

(2.1)  $\lambda_j(x_1, x_2, x_3) = \sum_{k \,:\, k \ne j} x_k e_k$.

The point of these definitions is that $\lambda_j$ is not a function of $x_j$. For a given set $A \subset \mathbb{Z}_N^2$, the expected number of corners in A is

$\mathbb{E}_{x_1,x_2,x_3 \in \mathbb{Z}_N} A(x_1, x_2) A(x_1 + x_3, x_2) A(x_1, x_2 + x_3)$

$\quad = \mathbb{E}_{x_1,x_2,x_3 \in \mathbb{Z}_N} A(x_1, x_2) A(x_3 - x_2, x_2) A(x_1, x_3 - x_1) \qquad (x_3 \to x_3 - x_1 - x_2)$

$\quad = \mathbb{E}_{x_1,x_2,x_3 \in \mathbb{Z}_N} \prod_{j=1}^{3} A \circ \lambda_j(x_1, x_2, x_3)$.

Each of the three functions is a function of just two of the three variables $x_1, x_2, x_3$.



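There is a specific mechanism to address expectations of such products: the Gowers Box Norms. Define one of these norms on a function g of $x_1, x_2$ as follows.

(2.2)  $\|g\|_{\square\{1,2\}} = \bigl[\mathbb{E}_{x_1, x_1', x_2, x_2' \in \mathbb{Z}_N}\, g(x_1, x_2)\, g(x_1', x_2)\, g(x_1, x_2')\, g(x_1', x_2')\bigr]^{1/4}$

which is the average cross-correlation of g at the four corners of a rectangle selected at random from $\mathbb{Z}_N \times \mathbb{Z}_N$. Write $\delta = \mathbb{P}(A)$, and $f = A - \delta$, which is, following Gowers' terminology, the balanced function of A. We then expand one of the A's in the expectation above as $A = \delta + f$,

$\mathbb{E}_{x_1,x_2,x_3 \in \mathbb{Z}_N} \prod_{j=1}^{3} A \circ \lambda_j(x_1, x_2, x_3) = C_1 + C_2$,

$C_1 = \delta\, \mathbb{E}_{x_1,x_2,x_3 \in \mathbb{Z}_N} \prod_{j=1}^{2} A \circ \lambda_j(x_1, x_2, x_3)$,

$C_2 = \mathbb{E}_{x_1,x_2,x_3 \in \mathbb{Z}_N}\, f \circ \lambda_3 \prod_{j=1}^{2} A \circ \lambda_j(x_1, x_2, x_3)$.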

For the first of these terms, one can check directly that

$C_1 \ge \delta\, \mathbb{E}_{x_1} \bigl[\mathbb{E}_{x_2} A(x_1, x_2)\bigr]^2 \ge \delta^3$.

For sets A with the number of corners approximately equal to the number of corners that one would naively expect, this should be the dominant term. On the other hand, it is the import and power of the Gowers Box Norms that we have the inequality

(2.3)  $|C_2| \le \|f\|_{\square\{1,2\}}$.

Thus, if this last quantity is less than, say, $\tfrac12 \delta^3$, then A has at least one-half of the expected number of corners.
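
The two-dimensional decomposition can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes a random subset of $\mathbb{Z}_5^2$ (the size N = 5 and the seed are arbitrary choices), applies the change of variables $x_3 \to x_3 - x_1 - x_2$, expands the factor $A(x_1, x_2)$ as $\delta + f$, and compares the resulting terms with $\delta^3$ and with the Box Norm of the balanced function. All function names are hypothetical.

```python
import random
from itertools import product

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def corner_decomposition(A, N):
    """Return (delta, total, C1, C2) for a set A of pairs in Z_N^2."""
    ind = lambda x, y: 1.0 if (x % N, y % N) in A else 0.0
    delta = len(A) / N**2
    f = lambda x, y: ind(x, y) - delta  # balanced function of A
    cube = list(product(range(N), repeat=3))
    # Average number of corners, trivial corners (x3 = 0) included.
    total = mean(ind(x1, x2) * ind(x1 + x3, x2) * ind(x1, x2 + x3)
                 for x1, x2, x3 in cube)
    # After x3 -> x3 - x1 - x2, expand the factor A(x1, x2) as delta + f.
    c1 = delta * mean(ind(x3 - x2, x2) * ind(x1, x3 - x1) for x1, x2, x3 in cube)
    c2 = mean(f(x1, x2) * ind(x3 - x2, x2) * ind(x1, x3 - x1)
              for x1, x2, x3 in cube)
    return delta, total, c1, c2

def box_norm(A, N):
    """Two-dimensional Box Norm of the balanced function of A."""
    delta = len(A) / N**2
    f = lambda x, y: (1.0 if (x, y) in A else 0.0) - delta
    fourth = mean(f(x1, x2) * f(y1, x2) * f(x1, y2) * f(y1, y2)
                  for x1, y1, x2, y2 in product(range(N), repeat=4))
    return max(fourth, 0.0) ** 0.25  # guard against rounding below zero

random.seed(0)
N = 5
A = {(x, y) for x in range(N) for y in range(N) if random.random() < 0.5}
delta, total, c1, c2 = corner_decomposition(A, N)
norm = box_norm(A, N)
```

In this run one expects `total` to equal `c1 + c2` up to rounding, `c1` to be at least `delta**3`, and `abs(c2)` to be at most `norm`.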

There is, however, the alternative that $\|f\|_{\square\{1,2\}} \ge \tfrac12 \delta^3$, which brings us to an unfortunate fact concerning these Box Norms: The definition in (2.2) makes perfect sense on the product of arbitrary probability spaces. Accordingly, the Box Norm being large can only have a probabilistic consequence. In the two dimensional case, it is this: There are subsets $R_1, R_2 \subset \mathbb{Z}_N$ so that A correlates with the product set $R_1 \times R_2$,
namely P(A | Ri x R2) > 6 + \6^'^, and the product set Ri x R2 is non-trivial, in that we 
have the estimates P(Ri), P(R2) ^ cb^^, for appropriate constant c. There is however no 
additional structure on the sets Ri and R2. 



The natural path, originating in Roth's proof [17] for three term arithmetic progressions, is to iterate this alternative. We can only hope to achieve an increment in density of A by
an amount of b^^ a finite number of times. But without an additional insight, the iteration 
cannot go forward as the use of the Gowers Box Norms requires at least a little arithmetic information through the use of the change of variables. Shkredov iTlSl found a solution to this problem by introducing a secondary iteration, the result of which is that one finds further subsets $R_1' \subset R_1$ and $R_2' \subset R_2$ which satisfy three conditions. First, we maintain the
property that A has a higher density on R^ x K^, namely P(A | R^ x R2) > 6 + \b^'^. Second, 
the sets $R_1'$ and $R_2'$ are non-trivial, in that they have a lower bound on their probabilities. Third, $R_1'$ and $R_2'$ have arithmetic properties, in that their one-dimensional Box Norms are small. Specifically, $R_1', R_2'$ are subsets of a subspace $H \le \mathbb{F}_p^n$, where there is a lower bound on the dimension of H, and the norms

$\|R_j'(x_1 + x_2) - \mathbb{P}(R_j' \mid H)\, H(x_1 + x_2)\|_{\square\{1,2\}(H \times H)}, \quad j = 1, 2$

are small. The first two conditions are certainly required. It is the third property that permits the iteration to continue, as a subtle refinement of the inequality (2.3) is available.

There is one additional feature of this discussion that we should bring forward, as it plays a decisive role in the three-dimensional case. Namely, the discussion above gave a distinguished role to the standard basis $(e_1, e_2)$, whereas the formulation of the question makes sense for any choice of basis from the three vectors $\{e_1, e_2, e_3\}$. One can phrase a 'coordinate-free' version of Shkredov's argument, which is the viewpoint of [15]. This is the viewpoint we adopt in the three-dimensional case.

We turn to the three dimensional case. We again have the standard basis $e_j$, for $j = 1, 2, 3$, in $\mathbb{Z}_N^3$. The fourth relevant basis element is $e_4 = \sum_{j=1}^{3} e_j$, associated to the endpoints of the corner. The analogs of the functions $\lambda_j$ in (2.1) are now four distinct functions from $\mathbb{Z}_N^4 \to \mathbb{Z}_N^3$ given by

$\lambda_j(x_1, x_2, x_3, x_4) = \sum_{k \,:\, k \ne j} x_k e_k$.

The point to exploit is that $\lambda_j$ is not a function of $x_j$.

For a given set $A \subset \mathbb{Z}_N^3$, the average number of corners in A is given by

$\mathbb{E}_{x_1,x_2,x_3,x_4 \in \mathbb{Z}_N} A(x_1, x_2, x_3) \prod_{j=1}^{3} A((x_1, x_2, x_3) + x_4 e_j) = \mathbb{E}_{x_1,x_2,x_3,x_4 \in \mathbb{Z}_N} \prod_{j=1}^{4} A \circ \lambda_j(x_1, x_2, x_3, x_4)$.

This is a four-linear term, with each of the four terms being dependent upon just three variables.

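Again, there is a Gowers Box Norm that is relevant. This norm, of a function $g(x_1, x_2, x_3)$, has a definition that can be given recursively as

$\|g(x_1, x_2, x_3)\|_{\square\{1,2,3\}}^{8} = \mathbb{E}_{x_3, x_3' \in \mathbb{Z}_N} \|g(x_1, x_2, x_3)\, g(x_1, x_2, x_3')\|_{\square\{1,2\}}^{4}$.

It has a similar interpretation as the average cross-correlation of g at the eight corners of a randomly chosen box in $\mathbb{Z}_N^3$. To exploit the norm, we make the same expansion of A. Set $\delta = \mathbb{P}(A \mid \mathbb{Z}_N^3)$, and write $A = \delta + f$. Use this expansion just on $A \circ \lambda_4$ above, so that the average splits as $C_1 + C_0$, where

$C_1 = \delta\, \mathbb{E}_{x_1,x_2,x_3,x_4 \in \mathbb{Z}_N} \prod_{j=1}^{3} A \circ \lambda_j$,

$C_0 = \mathbb{E}_{x_1,x_2,x_3,x_4 \in \mathbb{Z}_N}\, f \circ \lambda_4 \prod_{j=1}^{3} A \circ \lambda_j$.

The Box Norm is introduced because it controls the second term.

(2.4)  $|C_0| \le \|f\|_{\square\{1,2,3\}}$.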

Thus, if the Box Norm is sufficiently small, $C_0$ should be negligible. Turning to the term $C_1$, typically we would expect $C_1$ to be of the order of $\delta^4$, but we do not have any simple recourse to establishing such a bound. Indeed, $C_1$ is an instance of the two-dimensional question, as $C_1$ is $\delta$ times the average number of two-dimensional corners in A, with the two-dimensional corners located on hyperplanes of the form $(x_1, x_2, x_3) \cdot e_4 = c$, for some c.

This suggests to us that we will need to use a two-dimensional Box Norm on the hyperplanes just described. Namely, and this is an essential point, control of the Box Norm in (2.4) is not sufficient to control the number of corners in A. Control of one more Box Norm, in a second set of coordinates, is required. This situation can be avoided in the two-dimensional case.



We adopt a method that places the four coordinate vectors $\{e_j \mid 1 \le j \le 4\}$ on equal footing. For each choice of subset $I \subset \{1, 2, 3, 4\}$ of three elements, we have a Box Norm corresponding to the basis for $\mathbb{Z}_N^3$ given by $\{e_j \mid j \in I\}$. A sufficient condition for A to have a corner is that



max \\f\\ai<2-'6\ 

Icjl,2,3,4) 
|/|=3 

These norms are distinct: one can have $\|f\|_{\square\{1,2,3\}}$ very small, while $\|f\|_{\square\{1,2,4\}}$ is much larger. This situation does not arise in the one-dimensional case, as all of these norms turn out to be the same after a change of variables.

Turning to the alternative, suppose that we have ||/||n|i,2,3| > 2~^6'^. Again, the Box 
Norm admits a formulation on the three-fold product of probability spaces. Accordingly, the Box Norm being large can only have a probabilistic consequence, and it is a dramatically weaker statement than in the two-dimensional case. It is this: Associate $\mathbb{Z}_N^3$ to $\mathbb{Z}_N^{\{1,2,3\}}$, with the superscripts signifying the coordinates. For $I \subset \{1, 2, 3\}$ of cardinality 2, associate $\mathbb{Z}_N^{I}$ to the corresponding face of $\mathbb{Z}_N^{\{1,2,3\}}$. For each such I, there is a subset $R_I \subset \mathbb{Z}_N^{I}$. Consider the fibers that lie above this set, denoted by

$\overline{R}_I = \{(x_1, x_2, x_3) \in \mathbb{Z}_N^{\{1,2,3\}} \mid ((x_1, x_2, x_3) \cdot e_i \mid i \in I) \in R_I\}$.

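Then, the conclusions are twofold. First, A has a higher density in $\bigcap_{I \subset \{1,2,3\},\, |I| = 2} \overline{R}_I$, and second, the latter set is non-trivial, in that it admits a lower bound on its probability. Namely, the conclusions are

(2.5)  $\mathbb{P}\Bigl(A \Bigm| \bigcap_{I \subset \{1,2,3\},\, |I| = 2} \overline{R}_I\Bigr) \ge \delta + c\,\delta^{C}$,

(2.6)  $\mathbb{P}\Bigl(\bigcap_{I \subset \{1,2,3\},\, |I| = 2} \overline{R}_I\Bigr) \ge c\,\delta^{C}$.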

Here $0 < c, C$ are absolute constants. Note that both conclusions are substantive. There is no a priori reason that the set in (2.6) should admit this lower bound on its probability. The other conclusion (2.5) gives a correlation with a set; unfortunately, this set has substantially less structure than in the two-dimensional case.

Another essential complication arises from the fact that one must consider the 6 sets $\overline{R}_I$, for $I \subset \{1, 2, 3, 4\}$, I consisting of two elements. If we consider the three-fold intersection $\bigcap_{I \subset \{1,2,3\},\, |I| = 2} \overline{R}_I$, one can see that it is well-behaved with respect to corners if the individual sets $\overline{R}_I$ are well-behaved with respect to two-dimensional Box Norms, and their one-dimensional projections are well-behaved with respect to the U(3) norm.

But, there is no reason that the 3-dimensional set formed from the 6-fold intersection $\bigcap_{I \subset \{1,2,3,4\},\, |I| = 2} \overline{R}_I$ should be well-behaved with respect to any Box Norm. To overcome this difficulty, we introduce an auxiliary set $T \subset \overline{R}_I$ for all I. This set is required to be uniform with respect to all four three-dimensional Box Norms, but the Box Norm is taken relative to the sets $\overline{R}_I$.

We are left with the following task: Find the appropriate 'uniformity' conditions on the sets $\overline{R}_I$ and the set T so that these conditions are met. First, we can obtain a variant of the inequality (2.4), namely if the set A is uniform in the 'Box Norms adapted to T' then A has a corner. Second, assuming that A is not uniform with respect to a 'Box Norm adapted to T,' then we can find suitable variants of (2.5) and (2.6).

This must be done in a manner that is consistent with the choice of any of the four possible coordinate systems from $\{e_1, e_2, e_3, e_4\}$.

The remainder of the paper is organized as follows. 

• §3 presents the most important definitions and three Lemmas which combine to prove our main result, Theorem 1.2. These three Lemmas set out, in broad terms, the iteration scheme of Shkredov [18], but the formulation of the definitions is hardly clear.

  - A critical definition is that of a corner-system, Definition 3.1. Such a system consists of the set A, in which we seek a corner, and a number of auxiliary sets, such as the sets $\overline{R}_I$ mentioned above. If the auxiliary sets are 'suitably uniform' then the corner-system is called admissible, see Definition 3.4.

  - A 'generalized von Neumann Lemma,' to use the phrase of Ben Green and Terence Tao [8]. Lemma 3.13 states that if the corner-system is admissible, and A is suitably 'uniform' in a non-obvious sense (and A is not too small, a weak condition) then A has a corner.

  - An 'increment Lemma,' Lemma 3.16. This Lemma tells us that in the event that the hypothesis of Lemma 3.13 fails, we can find a new corner-system, which is non-trivial, in which A has a larger density. It is this step that provides termination in our iteration, as the density of a set can never exceed one. The non-triviality comes from suitable lower bounds on the probabilities associated to the sets in the corner-system. This Lemma, probabilistic in nature, does not provide for an admissible corner-system.

  - A 'Uniformizing Lemma,' Lemma 3.17, in which a non-admissible corner-system is made admissible, permitting the recursion to continue.

  These three Lemmas are combined, in a known way, see §10, to prove the Main Theorem.

• §4 sets out notation for the Box Norms which are essential for the entire paper, in particular the Gowers-Cauchy-Schwarz Inequality 4.2. These considerations have to be set out in some generality, as the later arguments will encounter a variety of Box Norms, and multi-linear forms consisting of up to 56 functions. Most, but not all, of this section is standard, but worked out in a setting in which the underlying sets have relatively large probabilities.

• §5 applies the results on the Box Norm to some classes of linear forms which arise in the context of the three-dimensional Box Norm. These results have proofs which are appropriate refinements of the proof of the Gowers-Cauchy-Schwarz Inequality, taking into account the fact that the underlying sets we are interested in have very small probabilities. This section introduces a notion of uniformity with respect to linear forms of a bounded complexity, Definition 5.2. An important component of the argument is that the sets we consider only have a uniformity, in the sense of Definition 5.2, of a bounded complexity. Also in this section, and particularly important, is the First Proposition on Conservation of Densities, Proposition 5.11, and its corollary Lemma 5.14.

• §6 is a reprise of the previous section. In principle, we could have written one section to encompass both this section and §5, but felt that this might make the paper harder to read. This section contains the Second Proposition on Conservation of Densities, Proposition 6.4. Both of these sections are central to the remainder of the argument.

• §7 will prove the first of the three Lemmas, Lemma 3.13, by a subtle reworking of a standard Box Norm inequality. In its simplest form, this argument was found by Shkredov [8], but has a more refined elaboration in the current context.

• §8 presents a Lemma we refer to as a 'Paley-Zygmund inequality for the Box Norm,' see Lemma 8.2. Namely, assuming that the Box Norm is big, deduce, e.g., the conclusions (2.5) and (2.6) above. This Lemma is presented in the simplest context, in the two dimensional setting. We then present the same Lemma, but in the 'weighted context,' that is, in a context where the underlying space is not just a tensor product space. See Lemma 8.3. Both of these Lemmas are stated in some generality, as the more general formulation is required in §9. The main result of this section, Lemma 8.3, requires a careful elaboration of the proof in the 'unweighted' case.

• In §9 we address the fact that the data provided to us from Lemma 8.2 and Lemma 8.3 does not have any uniformity properties. This is remedied by selecting a variety of partitions of the underlying space, with most of the 'atoms' of the partitions being sufficiently uniform. It is in this section that the Ackermann function will arise. The main Lemma is Lemma 3.17.

• The three Lemmas of §3 are combined to prove our main Theorem in §10.



3 Principal Lemmata 

Our proof is recursive, with each step in the recursion identifying a new subspace $H \le \mathbb{F}_5^n$ in which we work. H is of course a copy of $\mathbb{F}_5^n$, just with a smaller value of n. We maintain a lower bound on the dimension of H.

$H \times H \times H$ has the standard basis elements $e_1$, $e_2$, and $e_3$. We also use the basis element

$e_4 = e_1 + e_2 + e_3$,

which is the element associated with the 'endpoints' of the corner. A corner has an equivalent description in terms of any three elements of the four basis elements $\{e_i \mid 1 \le i \le 4\}$.

Below, we will work with sets $S_i$, $1 \le i \le 4$. They can be viewed as subsets of H. But in addition, we view them as subsets of $H \times H \times H$, as follows:

$\overline{S}_i = \{x \in H \times H \times H \mid x \cdot e_i \in S_i\}, \quad 1 \le i \le 4$.

Thus, the fibers over $S_i$ are copies of $H \times H$.



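Likewise we will work with sets $R_{j,k} \subset S_j \times S_k$. They can be viewed as subsets of $H \times H \times H$ by setting

$\overline{R}_{j,k} = \{x \in H \times H \times H \mid (x \cdot e_j, x \cdot e_k) \in R_{j,k}\}, \quad 1 \le j < k \le 4$.

Thus, the fibers of $\overline{R}_{j,k}$ are copies of H.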

3.1 Definition. By a corner-system we mean the data

(3.2)  $\mathcal{A} = \{H, S_i, R_{i,j}, T, A \mid 1 \le i, j \le 4\}$

where these conditions are met.

1. H is a subspace of $\mathbb{F}_5^n$.

2. $S_i \subset H$, $1 \le i \le 4$.

3. $R_{j,k} \subset S_j \times S_k$, $1 \le j < k \le 4$.

4. $T \subset \overline{R}_{j,k}$, $1 \le j < k \le 4$.

5. $A \subset T$.

By a T-system we mean the data

(3.3)  $\mathcal{T} = \{H, S_i, R_{i,j}, T \mid 1 \le i, j \le 4\}$

which is the same as a corner-system, except that the set A is not listed, and so condition (5) above is not needed.

For such systems we use the notations

$T_\ell := \bigcap_{1 \le j < k \le 4;\ j, k \ne \ell} \overline{R}_{j,k}, \quad 1 \le \ell \le 4$,

$\delta_j := \mathbb{P}(S_j \mid H), \qquad \delta_{j,k} := \mathbb{P}(R_{j,k} \mid S_j \times S_k), \quad 1 \le j < k \le 4$,

$\delta_{T \mid \ell} := \mathbb{P}(T \mid T_\ell), \quad 1 \le \ell \le 4$.

The sets $T_\ell$ play an essential role in this proof for the following reason. They are built up from lower dimensional objects in a natural way, and presuming that the lower dimensional objects are themselves well behaved with respect to Box Norms, then $T_\ell$ is as well. The same conclusion does not seem to hold for the 6-fold intersection $\bigcap_{1 \le j < k \le 4} \overline{R}_{j,k}$. That in turn leads us to the introduction of the auxiliary set $T \subset \overline{R}_{j,k}$. Working on this indeterminate set T leads to most of the complications of this paper.

We use the notation $R_{j,k} \subset S_j \times S_k$ rather than the (more natural) $S_{j,k}$, as we will use the notation $S_{j,k} := S_j \times S_k$ in association with a number of Box Norms throughout the paper.

3.4 Definition. Let $C_{\mathrm{admiss}} > 64$ be a fixed large constant, and $0 < \kappa_{\mathrm{admiss}} < 1$ be a fixed small constant. Given $0 < \epsilon < 1$, and a T-system $\mathcal{T}$ as in (3.3), we say that $\mathcal{T}$ is $\epsilon$-admissible iff

(3.5) "^"t^;/^^""""^^' < Kad.iss£^— • P(T I T,)-— , 1 < ^ < 4 , 

(3.6) \\R,j - diJU^s.xsp < Kadmisse''^^"''^^P(T I H X H X H)^^^— , 1 < z < ; < 4 , 

(3.7) IIS, - 6,||u(3) < Kadmisse''^^'"'^^P(T I H X H X H)^^^™'- , 1 < z < 4 . 

All conditions require uniformity of the objects in terms of the density of T in that object. But the condition in (3.5) cannot be strengthened in any way, and it is the condition that turns out to be the most subtle. In particular, it will turn out that we can compute the expression $\|T_\ell\|_{\square\{i \mid i \ne \ell\}}$ in (3.5), but it is also the case that $T_\ell$ is not uniform with respect to the norm $\square\{i \mid i \ne \ell\}$.

The norms in (3.5) and (3.6) are detailed in Definition 4.1 and (3.10), but also given explicitly in the next definition.

3.8 Definition. Let X, Y and Z be finite sets. For any function $f : X \to \mathbb{C}$, we use the notation for expectation, namely

$\mathbb{E}_{x \in X} f(x) = |X|^{-1} \sum_{x \in X} f(x)$.

Corresponding notation for probability $\mathbb{P}(A)$, conditional probabilities, conditional expectations, and conditional variance are also used.

For a function $f : X \times Y \to \mathbb{R}$, define

(3.9)  $\|f\|_{\square^{x,y}(X \times Y)}^{4} := \mathbb{E}_{x, x' \in X;\ y, y' \in Y}\, f(x, y) f(x, y') f(x', y) f(x', y')$.

Note that the right hand side is the average of the cross-correlation of f over all combinatorial rectangles in $X \times Y$.

For a function $f : X \times Y \times Z \to \mathbb{R}$, define

$\|f\|_{\square^{x,y,z}(X \times Y \times Z)}^{8} := \mathbb{E}_{z, z' \in Z} \|f(\cdot, \cdot, z) f(\cdot, \cdot, z')\|_{\square^{x,y}(X \times Y)}^{4}$

$\quad = \mathbb{E}_{x, x' \in X;\ y, y' \in Y;\ z, z' \in Z}\, f(x, y, z) f(x, y', z) f(x', y, z) f(x', y', z) \times f(x, y, z') f(x, y', z') f(x', y, z') f(x', y', z')$.

This has a similar interpretation as the norm in (3.9). In (3.5), we use the notation

(3.10) \\g\\n{i I M] ■= llgllnl'l wi(HxHxH) • 

This notation is consistent with (|3.10|l below. 
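
The recursive and the explicit form of the three-dimensional Box Norm in Definition 3.8 can be compared directly on a small example. The following Python sketch is only an illustration: the function names and the test function `f` on $\{0,1,2\}^3$ are assumptions made here, not objects from the paper.

```python
from itertools import product

def box2_fourth(g, X, Y):
    """Fourth power of the two-dimensional Box Norm (3.9) of g on X x Y."""
    total = sum(g(x, y) * g(x, yp) * g(xp, y) * g(xp, yp)
                for x, xp, y, yp in product(X, X, Y, Y))
    return total / (len(X) ** 2 * len(Y) ** 2)

def box3_eighth_recursive(f, X, Y, Z):
    """Eighth power of the 3D Box Norm, via the average over pairs z, z'."""
    total = sum(box2_fourth(lambda x, y: f(x, y, z) * f(x, y, zp), X, Y)
                for z, zp in product(Z, Z))
    return total / len(Z) ** 2

def box3_eighth_explicit(f, X, Y, Z):
    """Eighth power of the 3D Box Norm, as an average over all boxes."""
    total = 0.0
    for x, xp, y, yp, z, zp in product(X, X, Y, Y, Z, Z):
        term = 1.0
        # Multiply f over the eight vertices of the box.
        for a, b, c in product((x, xp), (y, yp), (z, zp)):
            term *= f(a, b, c)
        total += term
    return total / (len(X) * len(Y) * len(Z)) ** 2

X = Y = Z = range(3)
f = lambda x, y, z: float((x * y + 2 * z + x * z) % 3) - 1.0
recursive = box3_eighth_recursive(f, X, Y, Z)
explicit = box3_eighth_explicit(f, X, Y, Z)
```

The two values agree up to rounding, and both are non-negative, since each inner two-dimensional average is a mean of squares.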

The U(3) norm used in (3.7) has a definition that is similar to the Box Norms, but has an additive component.

3.11 Definition. For $f : H \to \mathbb{R}$, we define

$\|f\|_{U(3)} := \|f(x + y + z)\|_{\square^{x,y,z}(H \times H \times H)}$.
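
To make Definition 3.11 concrete, here is a minimal Python sketch in the simplest case where H is the one-dimensional space $\mathbb{F}_5$, so that functions on H are lists indexed by residues mod 5. The name `u3_norm` and the choice of the set $S = \{0, 1\}$ are assumptions made for illustration only.

```python
from itertools import product

def u3_norm(f, p):
    """U(3) norm of f on Z_p, as the Box Norm of (x, y, z) -> f(x + y + z)."""
    total = 0.0
    for x, xp, y, yp, z, zp in product(range(p), repeat=6):
        term = 1.0
        # Product of f(a + b + c) over the eight vertices of the box.
        for a, b, c in product((x, xp), (y, yp), (z, zp)):
            term *= f[(a + b + c) % p]
        total += term
    mean = total / p ** 6
    return max(mean, 0.0) ** 0.125  # guard against rounding below zero

p = 5
indicator = [1.0, 1.0, 0.0, 0.0, 0.0]      # indicator of S = {0, 1}
density = sum(indicator) / p                # P(S | H) = 0.4
balanced = [v - density for v in indicator]
norm_one = u3_norm([1.0] * p, p)
norm_indicator = u3_norm(indicator, p)
norm_balanced = u3_norm(balanced, p)
```

The constant function 1 has norm 1, the indicator has norm at least its density, and the triangle inequality bounds the norm of the indicator by the norm of the balanced function plus the density.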

In these definitions, observe

• A $\delta$ represents a 'density,' and this will most frequently be a relative density. Thus, $\delta_{i,j}$ is the density of $R_{i,j}$ in $S_i \times S_j$. In some of these notations, this relative density is indicated explicitly, as in the definition for $\delta_{T \mid \ell}$.

• Likewise, the Box Norms in (3.5) and (3.6) are relative Box Norms. In (3.6), this relative norm is indicated in the notation. But, in (3.5) this is indicated by the division by $\|T_\ell\|_{\square\{i \mid i \ne \ell\}}$.

• Notice that the uniformity conditions (3.5)-(3.6) are phrased relative to the 'higher dimensional objects in question.' Thus, the uniformity condition on T in (3.5) is phrased in terms of the densities of T in $T_\ell$.

• The previous point, not anticipated by the two-dimensional version of this Theorem, is important to the proof of our critical Lemma 3.17 below. And it complicates the proof of Lemma 3.13.
• It is possible that the degree of uniformity required on $S_i$ in (3.7) and $R_{i,j}$ in (3.6) is too high. For instance, one could imagine that (3.7) should be replaced by

(3.12) lis, - 6,||u(3) < Ke^^'i--P(r I S,f^^'-''^ , 1 < z < 4 . 

As it turns out, the conditions (3.7) and (3.6) are available to us by this proof, and so we use them. The distinction between (3.12) and (3.7) could be important in extensions of this argument to higher dimensions.

The three Lemmas are very much as in j TSllTSl , though with more complicated statements in the current setting. The first Lemma asserts that for admissible corner-systems, if the dimension is not too small, and the Box Norms $\|A - \delta_{A \mid T} T\|_{\square\{i \mid i \ne \ell\}}$ are sufficiently small, uniformly in $\ell$, then A has a corner.

3.13 The von Neumann Lemma. Suppose that we are given a corner-system $\mathcal{A}$ as in (3.2). Set $\delta_{A \mid T} = \mathbb{P}(A \mid T)$, and assume that $\mathcal{A}$ is $\delta_{A \mid T}$-admissible. The following two conditions are then sufficient for A to have a corner.

4 4 

(3.14) SAir-n-^i- n ^i,i^-]\^T\f\iit>m\, 

;=1 l<;<fc<4 i=l 

/0 1I-X \\A-^A\TT\\n{i\M] 4 

(3.15) max — <kOa\t- 

' 1^^^4 ||T||a(,|,^,) ^1^ 



The condition (3.14) is the condition, typical to the subject, that the 'average number of corners' in A exceed the number of 'trivial corners' in A. The second condition (3.15) is the all important uniformity condition. The second Lemma is the alternative if (3.15) does not hold.

3.16 Density Increment Lemma. There is an absolute constant $\kappa$ for which the following holds. Suppose that the corner-system in (3.2) is $\delta_{A \mid T}$-admissible, and that (3.15) does not hold. Then,



there are sets 

s'.cs,, r:,cr,„ rcT;= H S[^ 

These sets satisfy the estimates P(T' | T) > b^^^j. and P(A | T') > 5^ i r + ^H^j- 

It is the last estimate that provides a termination for our algorithm in §10. The previous Lemma, which is probabilistic in nature, does not supply us with admissible data. This is rectified in the next Lemma.

3.17 Uniformizing Lemma. There are functions

^dim,^T : [0,lf ^N 

for which the following holds for all $0 < v < \delta < 1$. Let $\mathcal{A}$ be a corner-system as in (3.2). Assume that $\mathbb{P}(A \mid T) \ge \delta + v$. There is a new corner-system

$\mathcal{A}' = \{H', S_i', R_{i,j}', T', A' \mid 1 \le i, j \le 4\}$

so that for some $x \in H$, $A' \subset A + x$, and similarly $T' \subset T + x$. More importantly, we have:

(3.18)  $\dim(H') \ge \dim(H) - \Psi_{\dim}(v, \delta)$

(3.19) P(A' I r) > 6 + I 

(3.20)  $\mathcal{A}'$ is $\delta$-admissible,

(3.21)  $\mathbb{P}(T' \mid H' \times H' \times H') \ge \Psi_T(\delta, v, \mathbb{P}(T \mid H \times H \times H))$.

We remark that in (3.18), if the dimension of H is too small, then $\mathcal{A}'$ will be trivial in that T' consists of only one point. These Lemmas are combined in a standard way to prove our Main Theorem. The details are in §10.



4 Box Norms 

It will be helpful to recall the Gowers uniformity or Box Norms in a more general form. In this we follow the presentation in the appendices of [11], with most, but not all, Lemmas similar in statement to that reference. The notion of a Box Norm is critical to all the principal arguments of this paper; accordingly, we have pulled these general results together into their own section.

4.1 Definition of Gowers Box Norms. Let $\{X_u\}_{u \in U}$ be a finite non-empty collection of finite non-empty sets indexed by $u \in U$. For any $V \subseteq U$ write $X_V := \prod_{u \in V} X_u$ for the Cartesian product. For a complex-valued function $f_U : X_U \to \mathbb{C}$, we define the Gowers Box Norm (or just Box Norm) $\|f_U\|_{\square^U(X_U)} \in \mathbb{R}_+$ to be

$\|f_U\|_{\square^U(X_U)}^{2^{|U|}} := \mathbb{E}_{x_U^0, x_U^1 \in X_U} \prod_{\omega_U \in \{0,1\}^U} \mathcal{C}^{|\omega_U|} f_U(x_U^{\omega_U})$

where $\mathcal{C} : z \mapsto \bar{z}$ is complex conjugation, and for any $x_U^0 = (x_u^0)_{u \in U}$ and $x_U^1 = (x_u^1)_{u \in U}$ in $X_U$ and $\omega_U = (\omega_u)_{u \in U}$ in $\{0,1\}^U$, we write $x_U^{\omega_U} := (x_u^{\omega_u})_{u \in U}$ and $|\omega_U| := \sum_{u \in U} \omega_u$. In the special case that U is empty, forcing $f_U$ to be a constant, we have $\|f_U\|_{\square^U(X_U)} := |f_U|$.
Above, we use the notation $A^B$ for the class of maps from B into A, which notation will be used throughout the paper. If $U = \{u\}$, then $\|f_U\|_{\square^U(X_U)} = |\mathbb{E}_{x_u} f|$. In particular this is non-negative, and can be zero. Note that if $A \subset X_U$, $\|A\|_{\square^U(X_U)}^{2^{|U|}}$ is the average number of 'boxes' in A. Thus, $\|A - \mathbb{P}(A \mid X_U)\|_{\square^U(X_U)}$ measures the degree to which A behaves as expected, in regards to the number of boxes it contains. It is also easy to verify that if A is a randomly selected subset of $X_U$, then $\|A - \mathbb{P}(A \mid X_U)\|_{\square^U(X_U)}$ is small. A similar point is essential to this section: Sets whose balanced functions are small with respect to this semi-norm behave in a manner similar to randomly selected subsets. A set A for which $\|A - \mathbb{P}(A \mid X_U)\|_{\square^U(X_U)}$ is small we will call uniform.
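
Definition 4.1 translates almost line by line into code. The sketch below is an illustrative implementation under simple assumptions: the index set U is taken to be {0, ..., d-1}, a function on $X_U$ is a Python callable on tuples, and the name `gowers_box_norm` is chosen here rather than taken from the paper.

```python
from itertools import product

def gowers_box_norm(f, spaces):
    """Box Norm of f on the product of the finite sets listed in `spaces`.

    f takes a tuple with one coordinate per set and may be complex-valued.
    """
    d = len(spaces)
    if d == 0:
        return abs(f(()))  # U empty: f is a constant
    total = 0.0
    count = 0
    for x0 in product(*spaces):
        for x1 in product(*spaces):
            term = 1.0
            for omega in product((0, 1), repeat=d):
                # Coordinate u is taken from x1 if omega_u = 1, else from x0.
                point = tuple(x1[u] if omega[u] else x0[u] for u in range(d))
                value = complex(f(point))
                if sum(omega) % 2 == 1:
                    value = value.conjugate()  # apply C^{|omega|}
                term *= value
            total += term
            count += 1
    mean = total / count
    return abs(mean) ** (1.0 / 2 ** d)

# U = {u}: the norm reduces to |E f|.
single = gowers_box_norm(lambda p: complex(p[0], 1), [range(4)])
# Indicator of the full product set: every box is contained in it.
full = gowers_box_norm(lambda p: 1.0, [range(2), range(3)])
# U empty: the norm of a constant is its absolute value.
constant = gowers_box_norm(lambda p: -3, [])
```

The running time grows like $|X_U|^2 \cdot 2^{|U|}$, so this is meant only for checking small examples.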

The Box Norms arise through the following inequality, proved by inductive application of the Cauchy-Schwarz inequality. For this Lemma, see [11, Lemma B.2].

4.2 Gowers-Cauchy-Schwarz Inequality. Let U be non-empty, and $\{X_u\}_{u \in U}$ be a finite collection of finite non-empty sets. For every $\omega_U \in \{0,1\}^U$ let $f^{\omega_U} : X_U \to \mathbb{C}$ be a function. Then

(4.3)  $\Bigl|\mathbb{E}_{x_U^0, x_U^1 \in X_U} \prod_{\omega_U \in \{0,1\}^U} \mathcal{C}^{|\omega_U|} f^{\omega_U}(x_U^{\omega_U})\Bigr| \le \prod_{\omega_U \in \{0,1\}^U} \|f^{\omega_U}\|_{\square^U(X_U)}$.

From this, it follows that one has the Gowers Triangle Inequality.

$\|f_U + g_U\|_{\square^U(X_U)} \le \|f_U\|_{\square^U(X_U)} + \|g_U\|_{\square^U(X_U)}$.

Indeed, raise both sides of the inequality above to the power $2^{|U|}$ and use (4.3).
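
Both inequalities can be sanity-checked numerically in the case $U = \{1, 2\}$. The sketch below assumes real-valued functions, so the conjugations in (4.3) play no role, and uses randomly generated functions with an arbitrary seed; the helper names are hypothetical.

```python
import random
from itertools import product

def box_form(fs, X, Y):
    """Left side of (4.3) without the absolute value, for U = {1, 2}.

    fs maps each omega = (a, b) in {0,1}^2 to a dict keyed by (x, y).
    """
    total = 0.0
    for x0, x1, y0, y1 in product(X, X, Y, Y):
        xs, ys = (x0, x1), (y0, y1)
        term = 1.0
        for (a, b), f in fs.items():
            term *= f[xs[a], ys[b]]
        total += term
    return total / (len(X) ** 2 * len(Y) ** 2)

def box_norm(f, X, Y):
    """Box Norm of a single real function: all four slots hold the same f."""
    same = {omega: f for omega in product((0, 1), repeat=2)}
    return max(box_form(same, X, Y), 0.0) ** 0.25

random.seed(1)
X, Y = range(3), range(4)

def random_function():
    return {(x, y): random.uniform(-1, 1) for x in X for y in Y}

fs = {omega: random_function() for omega in product((0, 1), repeat=2)}
lhs = abs(box_form(fs, X, Y))
rhs = 1.0
for f in fs.values():
    rhs *= box_norm(f, X, Y)

f, g = fs[0, 0], fs[1, 1]
h = {key: f[key] + g[key] for key in f}  # pointwise sum f + g
```

One expects `lhs <= rhs` as an instance of (4.3), and `box_norm(h, X, Y)` to be at most the sum of the norms of `f` and `g`.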
We will also refer to this corollary to the Gowers-Cauchy-Schwarz inequality.

4.4 Corollary. Let $\{X_u\}_{u \in U}$ be a finite collection of finite non-empty sets. For $V \subset U$, let $f_V : X_V \to \{z \in \mathbb{C} \mid |z| \le 1\}$. Then,

(4.5)  $\Bigl|\mathbb{E}_{x \in X_U} \prod_{V \subset U} f_V(x_V)\Bigr| \le \|f_U\|_{\square^U(X_U)}$.

That is, only the Box Norm associated to the largest set U is needed. Here, for $x \in X_U$, $x_V$ is the restriction of the sequence $x = \{x_u \mid u \in U\}$ to the set $V \subset U$.

The inequality (4.5) is [11, (B.7)], and it suggests that the $\square^U$ norm is insensitive to 'lower order' perturbations. We single out a more general inequality that is important to us.

4.6 Lemma. Under the hypotheses of Corollary 4.4, for $V_0 \subset U$, we have

(4.7)  $\Bigl|\mathbb{E}_{x \in X_U} \prod_{V \subset U,\ |V| \le |V_0|} f_V(x_V)\Bigr| \le \|f_{V_0}\|_{\square^{V_0}(X_{V_0})}$.

The inequality (4.7) has a proof similar to (4.5), and we omit the proof. (Our proof of the von Neumann Lemma below could provide a proof, as we comment when we arrive there.) It has a similar interpretation to the first inequality: the $\square^{V_0}$ norm is insensitive to perturbations of the same order in distinct variables.

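4.8 Corollary. For all $\epsilon > 0$ and all integers k, and finite sets U with $|U| \ge k$ there is a $C_1 = C_1(|U|, k, \epsilon)$ for which the following holds.

Let $\{X_u\}_{u \in U}$ be a finite collection of finite non-empty sets, and $X_V = \prod_{u \in V} X_u$ for $V \subset U$. Let $\mathcal{U}_k$ be the collection of subsets of U of cardinality k, and for each $V \in \mathcal{U}_k$ let $S_V \subset X_V$ satisfy

(4.9)  $\|S_V - \mathbb{P}(S_V)\|_{\square^V X_V} \le [\mathbb{P}(S_V)]^{C_1}, \quad V \in \mathcal{U}_k$.

Then, we have the inequality

(4.10)  $\Bigl|\mathbb{E}_{X_U} \prod_{V \in \mathcal{U}_k} S_V - \prod_{V \in \mathcal{U}_k} \mathbb{E}_{X_V} S_V\Bigr| \le \epsilon \prod_{V \in \mathcal{U}_k} \mathbb{E}_{X_V} S_V$.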


Thus, if all the sets Sy are very uniform with respect to the natural Box Norms, the 
expectation of the products of the Sy behaves as if the sets are randomly selected. 

Proof. We induct on the number $w$ of elements $V\in\mathcal{U}_k$ for which $S_V\ne X_V$. That is, we prove that for all $\epsilon>0$, integers $k$, and $1\le w\le|\mathcal{U}_k|$ there is a $C_1(|U|,k,\epsilon,w)$ so that if for collections $S_V$, with at most $w$ choices of $V\in\mathcal{U}_k$ do we have $S_V\ne X_V$, satisfying (4.9), we have (4.10).

The case of $w=1$ is obvious. Let us suppose that this holds for $1\le w<|\mathcal{U}_k|$, and prove the claim for $w+1$. We take

$C_2=C_2(|U|,k,\epsilon,w+1)=w+3+\log_2 1/\epsilon+C_1(|U|,k,\epsilon/2,w)$.

Considering the collections $S_V$ for $V\in\mathcal{U}_k$, we select $V_0$ so that $\mathbb{P}(S_{V_0})$ is minimal. Thus, in particular we must have $S_{V_0}\subsetneq X_{V_0}$. Write $S_{V_0}=\mathbb{P}(S_{V_0})+f_{V_0}$. Since all the sets in $\mathcal{U}_k$ have the same cardinality, we have the inequality

$\Bigl|\mathbb{E}_{x_U\in X_U}f_{V_0}\prod_{V\in\mathcal{U}_k-\{V_0\}}S_V\Bigr| \le \|f_{V_0}\|_{\square^{V_0}X_{V_0}} \le [\mathbb{P}(S_{V_0})]^{C_2} \le \frac{\epsilon}{2}\prod_{V\in\mathcal{U}_k}\mathbb{E}_{x_V\in X_V}S_V$.

The last line follows from the selection of $V_0$.

We can then apply the induction hypothesis to estimate

$\Bigl|\mathbb{E}_{X_U}\prod_{V\in\mathcal{U}_k}S_V-\prod_{V\in\mathcal{U}_k}\mathbb{E}_{X_V}S_V\Bigr| \le \frac{\epsilon}{2}\prod_{V\in\mathcal{U}_k}\mathbb{E}_{x_V\in X_V}S_V+\mathbb{P}(S_{V_0})\Bigl|\mathbb{E}_{X_U}\prod_{V\in\mathcal{U}_k-\{V_0\}}S_V-\prod_{V\in\mathcal{U}_k-\{V_0\}}\mathbb{E}_{X_V}S_V\Bigr| \le \epsilon\prod_{V\in\mathcal{U}_k}\mathbb{E}_{X_V}S_V$.

So the induction is complete.

We can then conclude the Lemma by taking $C_1(|U|,k,\epsilon)=C_2(|U|,k,\epsilon,|\mathcal{U}_k|)$.

We frequently use this corollary of the Gowers-Cauchy-Schwarz inequality.

4.11 Lemma. Let $\{X_u\}_{u\in U}$ be a finite collection of finite non-empty sets. For $V\subset U$, let $S_V\subset X_V$. Then, for an integer $k\le|U|$,

(4.12)   $\Bigl|\mathbb{E}_{x\in X_U}\prod_{V\subset U,\,|V|\le k}S_V(x_V)-\prod_{V\subset U,\,|V|\le k}\mathbb{E}_{x_V\in X_V}S_V(x_V)\Bigr| \le 2^{|U|}\cdot\max_{|V|\le k}\|S_V-\mathbb{E}_{x_V\in X_V}S_V\|_{\square^V(X_V)}$

Thus, if all the sets $S_V$ are very uniform with respect to the natural Box Norms, the expectation of the products of the $S_V$ behaves as if the sets are randomly selected. In order for this inequality to be non-trivial, we need

$\max_{|V|\le k}\|S_V-\mathbb{E}_{x_V\in X_V}S_V\|_{\square^V(X_V)} \le 2^{-|U|}\prod_{V\subset U,\,|V|\le k}\mathbb{E}_{x_V\in X_V}S_V(x_V)$

Of course, the Lemma is trivial if $k=1$, and for $k>1$, this uniformity requirement is quite restrictive if the sets $S_V$ have small probabilities. This is exactly the situation in our proof.
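
The heuristic behind Lemma 4.11 can be seen in a small experiment. The sketch below assumes $U=\{1,2,3\}$, $k=2$, and takes the three sets $S_V$, $|V|=2$, to be random subsets of density about one half (the sets indexed by smaller $V$ are taken to be all of $X_V$); random sets are uniform with high probability, so the expectation of the product should be close to the product of the densities. The set size `n` and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40

# Indicator functions of random subsets S_V of X_V, one for each pair V in U = {1, 2, 3}.
S12 = (rng.random((n, n)) < 0.5).astype(float)
S13 = (rng.random((n, n)) < 0.5).astype(float)
S23 = (rng.random((n, n)) < 0.5).astype(float)

# E_x S12(x1, x2) S13(x1, x3) S23(x2, x3), averaged over X_1 x X_2 x X_3.
joint = np.einsum("ab,ac,bc->", S12, S13, S23) / n**3

# Product of the individual densities E S_V.
product = S12.mean() * S13.mean() * S23.mean()

relative_error = abs(joint - product) / product
```

For sets of small density the same comparison is much more demanding, which is the restriction discussed after (4.12).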
Proof. We view

(4.13)   $\mathbb{E}_{x\in X_U}\prod_{V\subset U,\,|V|\le k}S_V(x_V)$

as a multi-linear form, with the order of the multi-linearity being $\sum_{j=1}^{k}\binom{|U|}{j}$, a term which we have crudely estimated by $2^{|U|}$ in (4.12). For each set $V\subset U$, we consider the expansion of the function $S_V$ as $S_V=g_{V,0}+g_{V,1}$ where $g_{V,0}=\mathbb{P}(S_V\mid X_V)\cdot X_V$, and $g_{V,1}$ is the balanced function. We expand the term in (4.13). Let $I$ be the collection of subsets of $U$ of cardinality at most $k$. We have

$(4.13)=\sum_{\epsilon\in\{0,1\}^I}\mathbb{E}_{x_U\in X_U}\prod_{V\subset U,\,|V|\le k}g_{V,\epsilon(V)}(x_V)$.

The leading term arises from the choice of $\epsilon_0$ which takes the value 0 for all choices of sets $V$. For this function we have

$\mathbb{E}_{x_U\in X_U}\prod_{V\subset U,\,|V|\le k}g_{V,\epsilon_0(V)}(x_V)=\prod_{V\subset U,\,|V|\le k}\mathbb{E}_{x_V\in X_V}S_V(x_V)$,

which is part of the expression on the left in (4.12). For any other choice of $\epsilon$, let $B_1\subset U$ be a maximal cardinality set for which $\epsilon(B_1)=1$. Then, for any subset $V\subset U$ with $|B_1|<|V|\le k$, we have $\epsilon(V)=0$, so that $g_{V,\epsilon(V)}$ is a constant function, taking a value of at most one. It follows from (4.7) that we have

$\Bigl|\mathbb{E}_{x_U\in X_U}\prod_{V\subset U,\,|V|\le k}g_{V,\epsilon(V)}(x_V)\Bigr| \le \Bigl|\mathbb{E}_{x_U\in X_U}\prod_{V\subset U,\,|V|\le|B_1|}g_{V,\epsilon(V)}(x_V)\Bigr| \le \|g_{B_1,1}\|_{\square^{B_1}(X_{B_1})}$.

From this, (4.12) follows. □



We note the following Corollary to the proof above, with the main distinction being that 
some of the functions are indicators of uniform sets as before, while others are arbitrary 
bounded functions. The conclusion is that the uniform sets matter little to the computation 
of the expectation. 

4.14 Corollary. Let $\{X_u\}_{u\in U}$ be a finite collection of finite non-empty sets and let $k$ be a non-zero integer. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be two collections of subsets of $U$, with all members of $\mathcal{V}_1$ and $\mathcal{V}_2$ having cardinality at most $k$. For $V\in\mathcal{V}_1$, let $S_V\subset X_V$. For $W\in\mathcal{V}_2$ let $f_W : X_W\to[-1,1]$ be a bounded function. Then,

$\Bigl|\mathbb{E}_{x\in X_U}\prod_{V\in\mathcal{V}_1}S_V(x_V)\prod_{W\in\mathcal{V}_2}f_W(x_W)-\prod_{V\in\mathcal{V}_1}\mathbb{E}_{x_V\in X_V}S_V(x_V)\times\mathbb{E}_{x_U\in X_U}\prod_{W\in\mathcal{V}_2}f_W(x_W)\Bigr| \le 2^{|U|}\cdot\max_{V\in\mathcal{V}_1}\|S_V-\mathbb{E}_{x_V\in X_V}S_V\|_{\square^V(X_V)}$.



We turn to a more complicated version of these Lemmas and Corollaries. 

4.15 Lemma. Let $U$ be a finite set, and $X_u$ for $u\in U$ another finite set. Fix $1\le k<|U|$, and let $\mathcal{V}$ be a collection of subsets of $U$ of cardinality at most $k$. Let $S_U\subset X_U$, and write $\delta=\mathbb{P}(S_U)$. Assume that

$\sup_{V\in\mathcal{V}}\mathbb{E}_{x^0_U\in X_U}\|f_U(x^V_U)\|_{\square^V X_V}=\tau<\delta^{|\mathcal{V}|-1}$, $f_U:=S_U-\delta$.

We emphasize that, in the expansion of the Box Norm above, the Box Norm is taken over the variables associated to $V$ and the expectation is taken over all variables in $U$. The conclusion is that we have the inequality below.

(4.16)   $\Bigl|\delta^{|\mathcal{V}|}-\mathbb{E}_{x^0_U,\,x^1_U\in X_U}\prod_{V\in\mathcal{V}}S_U(x^V_U)\Bigr| \lesssim \tau$.

The implied constant depends upon $|\mathcal{V}|$. Above, by very slight abuse of notation, we mean

$x^V_v=x^1_v$ if $v\in V$, and $x^V_v=x^0_v$ if $v\notin V$.

This is a 'conditional' version of Corollary 4.14. In particular, note that in (4.16), we impose the Box Norms in the variables $x_V$, and take the expectation over all of $X_U$. The conclusion is again that if the set is suitably small with respect to a family of relevant Box Norms, then a range of products of these sets behave as if the set were randomly selected.



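Proof. Let us begin by noting that for $V\in\mathcal{V}$, the monotonicity of the Box Norms as the variables increase implies that

$\mathbb{E}_{x^0_U}\bigl|\delta-\mathbb{E}_{x^1_V}S_U(x^V_U)\bigr| \le \|S_U-\delta\|_{\square^V X_V} \le \tau$.

It follows by the assumption on the magnitude of $\tau$ that we can estimate

$\mathbb{E}_{x^0_U}\Bigl|\delta^{|\mathcal{V}|}-\prod_{V\in\mathcal{V}}\mathbb{E}_{x^1_V}S_U(x^V_U)\Bigr| \le \delta^{|\mathcal{V}|}\bigl[(1+\tau\delta^{-1})^{|\mathcal{V}|}-1\bigr] \lesssim \tau$.

Also note that we can estimate, using Lemma 4.11,

$\mathbb{E}_{x^0_U}\Bigl|\mathbb{E}_{x^1_U}\prod_{V\in\mathcal{V}}S_U(x^V_U)-\prod_{V\in\mathcal{V}}\mathbb{E}_{x^1_V}S_U(x^V_U)\Bigr| \lesssim \mathbb{E}_{x^0_U}\sup_{V\in\mathcal{V}}\|S_U(x^V_U)-\mathbb{E}_{x^1_V}S_U(x^V_U)\|_{\square^V X_V} \le 2\tau$.

Putting these inequalities together proves the Lemma. □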



5 Linear Forms for the Analysis of Box Norms 

Box Norms, and counting corners in sets are examples of multi-linear forms that we will 
work with. Their analysis will lead to forms in as many as 24 functions, leading to the 
need for some general remarks on such objects. Moreover, we are analyzing these forms 
on objects that are far from tensor products. This is the primary focus of this section. 

We will be making a wide variety of approximations to different expectations. In order 
to codify these approximations, let us make this definition. 

5.1 Definition. Fix < f < 3"^^ be a small constant. For A,B > we will write A = B if 
|A - B| < vA. (We stack a 'u' on the equality, as this relation will always come about from 
uniformity.) In those (few) instances, where it is important emphasize the role of v, we 
will write A = B. 

We will only use the notation for quantities between 0 and 1. Observe the following. Let $0<A,B,\alpha,\beta<1$. If $A\stackrel{u}{=}\alpha B$ and $B\stackrel{u}{=}\beta$, then we have

$|A-\alpha\cdot\beta| \le |A-\alpha B|+\alpha\cdot|\beta-B| \le vA+\alpha vB \le 3vA$,

where the last step uses $\alpha B\le(1+v)A\le 2A$. Thus, we can write $A\stackrel{u}{=}\alpha\cdot\beta$, that is this relationship is weakly transitive. We will need to use a finite chain of inequalities of this type, with the longest chain associated with the analysis of a 28-linear form in Lemma 7.25 below. By abuse of notation, we will adopt the convention that $A\stackrel{u}{=}B$ and $B\stackrel{u}{=}C$ implies $A\stackrel{u}{=}C$. This transitivity will only be applied a finite number of times, so that taking a sufficiently small initial $v$ in Definition 5.1 will lead to a meaningful inequality at every stage of our proof.
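
The loss of a factor of 3 in one step of the chain can be checked on random data. The sketch below assumes the relation of Definition 5.1 in the form $|a-b|\le v\,a$, and the hypotheses in the form used in the displayed estimate ($A$ close to $\alpha B$, and $B$ close to $\beta$); the helper `approx` and the sampling ranges are illustrative choices.

```python
import random

def approx(a, b, v):
    """The relation of Definition 5.1, read as |a - b| <= v * a."""
    return abs(a - b) <= v * a

random.seed(3)
v = 0.01
violations = 0
for _ in range(10_000):
    alpha = random.uniform(0.05, 1.0)
    beta = random.uniform(0.05, 1.0)
    # Choose B close to beta, then A close to alpha * B, both within the tolerance v.
    B = beta / (1.0 + random.uniform(-0.99 * v, 0.99 * v))
    A = alpha * B / (1.0 + random.uniform(-0.99 * v, 0.99 * v))
    assert approx(B, beta, v) and approx(A, alpha * B, v)
    # Weak transitivity: A is then close to alpha * beta with tolerance 3v.
    if abs(A - alpha * beta) > 3.0 * v * A:
        violations += 1
```

After the loop, `violations` stays at zero, in line with the estimate $|A-\alpha\beta|\le 3vA$.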

A second situation we will have is this. Suppose that $A\stackrel{u}{=}B$ and $A'\stackrel{u}{=}B'$. Then,

$|AA'-BB'| \le |A-B|A'+|A'-B'|B \le v(AA'+A'B) \le 3vAA'$,

where the last step uses $B\le(1+v)A\le 2A$. Thus, we can write $AA'\stackrel{u}{=}BB'$, so this relationship is weakly multiplicatively transitive. We will need to use a finite chain of these inequalities, mostly related to computing conditional expectations. By abuse of notation, we will adopt the convention that $A\stackrel{u}{=}B$ and $A'\stackrel{u}{=}B'$ implies $AA'\stackrel{u}{=}BB'$. This observation is closely linked with the fact that our definition of admissibility, Definition 3.4, includes relative measures of uniformity.

Our Lemmas and Definitions should be coordinate-free, but to ease the burden of 
notation, we state them distinguishing the coordinate X4 for a special role. They will be 
applied in their more general formulations, which are left to the reader. 

We are concerned with the evaluation of certain multi-linear forms, especially those associated with Box Norms. For a collection of maps $\Omega\subset\{0,\dots,\Lambda-1\}^{\{1,2,3\}}$, where $\Lambda\ge2$ is an integer, let $\{f_\omega\mid\omega\in\Omega\}$ be a collection of functions. The linear forms we are interested in are

$L(f_\omega\mid\Omega)=\mathbb{E}_{x^\ell_{1,2,3}\in S_{1,2,3},\ 0\le\ell<\Lambda}\prod_{\omega\in\Omega}f_\omega(x^\omega_{1,2,3})$.

This next definition is concerned with the uniform evaluation of forms of this type, where the $f_\omega$ are particularly simple.

5.2 Definition. Let $\Lambda\ge3$ be an integer, and $0<\vartheta<1$. A subset $U\subset T_4$ is called $(\Lambda,\vartheta,4)$-uniform if the following holds. Set $\Omega_{3,\Lambda}=\{0,\dots,\Lambda-1\}^{\{1,2,3\}}$. For any subset $\Omega\subset\Omega_{3,\Lambda}$ we have the inequalities

(5.3)   $L(U\mid\Omega)\stackrel{\vartheta}{=}[\delta_4\,\delta_{U|4}]^{|\Omega|}\prod_{1\le j<k\le3}\delta_{j,k}^{|\{\omega|_{\{j,k\}}\,\mid\,\omega\in\Omega\}|}$

Here, $\delta_{U|4}=\mathbb{P}(U\mid T_4)$. That is, the percentage error between the two terms is at most $\vartheta$.

It is an important point that we index this notion on the number of linearities that we permit the form to have, as we must provide an upper bound on this notion of complexity. Our primary objective is that $T$ be well-behaved with respect to the Box Norm, in particular that Lemma 8.3 holds. This will require that $T$ be $(4,\vartheta_1,4)$-uniform, where $\vartheta_1$ is specified in that Lemma. But this will in turn require that $T_4$ be $(12,\vartheta_2,4)$-uniform. It is one purpose of this section to explain this relationship. See Lemma 5.4.

While we will use these results several times, there are two points where either these results apply, but would lead to an increased order of complexity, as in the proof of (7.31), or the results of this section are not stated in enough generality, as in the proof of (8.23). A full understanding of these issues would likely be an aid to extending this argument to higher dimensions.

In this definition, examining the product of densities, we see that $\delta_{U|4}=\mathbb{P}(U\mid T_4)$ has the power $|\Omega|$, that is the total number of terms in the product. The power on the density $\delta_{j,k}$ is the number of distinct maps of the form $\omega$, restricted to $\{j,k\}$, in the set $\Omega$. To set out an example, a typical term to which we will apply this definition is to the set $U=T_4$, in

$\mathbb{E}_{x_1\in S_1,\ x^0_{2,3},\,x^1_{2,3}}\prod_{\omega\in\{0,1\}^{\{2,3\}}}T_4(x_1,x^\omega_{2,3})$

Here, it is clear that $|\Omega|=4$, while

$|\{\omega|_{\{1,2\}}\mid\omega\in\Omega\}|=2$, $|\{\omega|_{\{1,3\}}\mid\omega\in\Omega\}|=2$, $|\{\omega|_{\{2,3\}}\mid\omega\in\Omega\}|=4$.
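
The exponents in (5.3) are pure bookkeeping, and can be computed mechanically. The sketch below assumes that the example above corresponds to the maps with $\omega(1)=0$ fixed and $\omega(2),\omega(3)$ ranging over $\{0,1\}$; the function name `density_exponents` is illustrative.

```python
from itertools import product

def density_exponents(omega_set, pairs=((1, 2), (1, 3), (2, 3))):
    """Return |Omega| and, for each pair {j, k}, the number of distinct
    restrictions of the maps in Omega to the coordinates {j, k}."""
    total = len(omega_set)
    restricted = {
        (j, k): len({(w[j - 1], w[k - 1]) for w in omega_set})
        for (j, k) in pairs
    }
    return total, restricted

# Assumed example: omega(1) = 0 is fixed; omega(2), omega(3) range over {0, 1}.
omega = {(0, a, b) for a, b in product((0, 1), repeat=2)}
size, counts = density_exponents(omega)
```

With this choice, `size` is 4 and `counts` assigns 2, 2 and 4 to the pairs $\{1,2\}$, $\{1,3\}$ and $\{2,3\}$, matching the counts stated in the text.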

The parameter $\vartheta$ appears on the right in (5.3), and represents how close, in terms of percentages, the expectation behaves with respect to its expected behavior.

A set $U$ is $(\Lambda,\vartheta,4)$-uniform if a wide set of expectations of $U$ 'behave as expected.' It is hardly obvious that even the set $T_4$ satisfies this definition, but it does, and we prove in Lemma 5.4 that both $T_4$ and $T$ are uniform.

5.4 Lemma. We have the following two assertions. For constants $C_1>C_0>0$ that depend only on $C_{\mathrm{admiss}}$ in Definition 3.4, the following are true.

1. For ^ = (5^", J , the set T4 is (12, d,4)-uniform. 
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2. For d = 6j^. J. , the set T is (6, &,4)-uniform. 

In fact, Ci, Co can be taken to be a small constant multiple of Cadmiss- 

As the statement of the Lemma indicates, there is a link between the complexity of the 
linear forms we need to consider for T and 74. 



Proof Let us discuss T^ first. Note that by (|321» and (l47|) , 

6<f<ll oieQ. 

(5-5) = ^A,,-s,,. n ^^(<*'' + <''' + ^f'^ n ^/'^(^m) 

6<f<ll coen l<j<k<3 

(5.6) = ^l"' • E<.3^s,,3 n n Sj,,{x^,) + 0{F{T\HxHxHf -----''). 

b<e<n aieQ i</<;c<3 

The power on P(T \ H X H X H) accounts for the fact that implicitly the condition (|3.7[) is 
an expectation over H, while above we are taking integration over Si 2,3. 



We continue with the analysis of the expectation above. We can use (|4.7[) and (|3.6[) to 
estimate 

(5.7) E,.^,eS,,,3n n 5m(^P= n 5;;"'''^''""+0(P(T|Si,2,3,4)"^^-'-)- 

6<fell wen l<j<k<3 l<i<k<3 

The leading terms of the expectations are exactly as desired. The two error terms in (|5.6|) 
and (|5.7|) should be as small as desired, namely that they contribute at most ^L(T4 | Q). 
But it is straight forward to see that we can take Co of the Lemma to be Cadmiss - 12 - |Q| > 
Cadmiss - 12 - 3^^ with 3^2 being the cardinality of 03^12 = {0, . . . , ll}li'2'3>. 

We turn to the second conclusion of the Lemma. Let $\Omega\subset\Omega_{3,6}$, and consider the multi-linear expression $L(T\mid\Omega)$. Each occurrence of $T$ is expanded as $T=f_1+f_0$ where $f_1=\delta_{T|4}T_4$. The leading term is when each $T$ is replaced by $f_1$, which leads to $\delta_{T|4}^{|\Omega|}$ times the expectation in (5.5). There are $2^{|\Omega|}-1$ terms remaining. Each of them has an occurrence of $f_0$. All of these terms can be controlled by the assumption (3.5), and importantly, the inequality (5.20) below. (We have not yet proved (5.20), part of Lemma 5.17, but its proof is independent of this argument.) This last Lemma is applied with $\Lambda=6$, $V=T_4$, which as we have just seen in the first half of the proof, is $(12,\vartheta',4)$-uniform, for a very small choice of $\vartheta'$. This gives us



|L(r I Q) - 6fl^ L(r4 I Q)| < 2l"l+i L(T4 | Q) 



ll/ollniA3Si,„ 



\\Ti\\n^,2,3Si,2,3 

< 2l"l+^6j^|^f=^ • L(T4 I Q) . 
And this completes the proof. n 

Here is a corollary to the previous Lemma that is certainly relevant for us. 
5.8 Lemma. We have this estimate 

a)e{0,lll'2'3 

a)6(0,lli'2'3 l<i<k<3 

=fI*^*^ n '%■ 

7=1 1</<A:<3 

We return to general considerations, and make a remark that we will refer to several times. Let $V\subset T_4$ be $(\Lambda,\vartheta,4)$-uniform. Let $\Omega\subset\Omega_{3,\Lambda-1}$, and assume that the set $\Omega_{1\to0}$ is non-trivial.

$\Omega_{1\to0}=\{\omega\in\Omega\mid\omega(1)=0\}$, $\Omega_{1\not\to0}=\Omega-\Omega_{1\to0}$.

Consider the estimate below obtained by applying the Cauchy-Schwarz inequality in all variables except $x_1^0$.

(5.9)   $L(V\mid\Omega)\le[L(V\mid\Omega_{1\not\to0})\cdot U_2]^{1/2}$,

$U_2=\mathbb{E}\prod_{\omega\in\Omega_{1\not\to0}}V(x^\omega_{1,2,3})\cdot\Bigl|\mathbb{E}_{x_1^0\in S_1}\prod_{\omega\in\Omega_{1\to0}}V(x^\omega_{1,2,3})\Bigr|^2$.

Use (7.11) to write the last term as $U_2=L(V\mid\Omega^1)$, where we define

$\bar\omega(j)=\Lambda$ if $j=1$, and $\bar\omega(j)=\omega(j)$ if $j=2,3$,

(5.10)   $\Omega^1=\Omega_{1\not\to0}\cup\{\omega,\bar\omega\mid\omega\in\Omega_{1\to0}\}$.

5.11 First Proposition on Conservation of Densities. If $V\subset T_4$ is $(\Lambda,\vartheta,4)$-uniform and $\Omega\subset\Omega_{3,\Lambda-1}$, then with the notation in (5.9)-(5.10) we have the equality

(5.12)   $L(V\mid\Omega)\stackrel{u}{=}L(V\mid\Omega_{1\not\to0})^{1/2}\cdot L(V\mid\Omega^1)^{1/2}$.

Proof. The proof is almost trivial. Each $\omega\in\Omega$ on the left contributes 1 to the densities $\delta_{V|4},\delta_4,\delta_{j,k}$ for $1\le j<k\le3$. If $\omega(1)\ne0$, it contributes to both terms on the right, so the square root makes contribution 1. If $\omega(1)=0$, then it contributes nothing to $L(V\mid\Omega_{1\not\to0})$, but contributes 2 to the other term $L(V\mid\Omega^1)$. □

The previous Lemma plays a decisive role in all our applications of the Cauchy-Schwarz inequality, to prove our weighted versions of these inequalities. This Conservation of Densities has an essentially equivalent formulation, also important to us, that we give here. With the notation of (5.9)-(5.10), set

(5.13)   $Z[\Omega_{1\to0}:\Omega_{1\not\to0}]=\mathbb{E}_{x_1^0\in S_1}\prod_{\omega\in\Omega_{1\to0}}V(x^\omega_{1,2,3})$

5.14 Lemma. Let A = 1, . . . , 6. Suppose that the set V c T^ is (A, d, A)-uniform, where d < 
T{V I T4)^''^' . Then, for all choices o/Q c Q3^;i_i as above, we have 



Var,.,n(z[Oi/.o : Oi^o] I f] ^(^1^3)) 



(5.15) 2 

<KV^-[E(Z[ni^o : Oi^oll n ^^2,3))] • 

Here, K is an absolute constant. 

Of course the conditional expectation of Z can be computed. 

Proof. We use the standard formula for the variance of a random variable $W$ supported on a set $Y$.

(5.16)   $\operatorname{Var}(W\mid Y)=\mathbb{P}(Y)^{-1}\,\mathbb{E}W^2-(\mathbb{P}(Y)^{-1}\cdot\mathbb{E}W)^2$
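
The formula (5.16) is the usual variance with respect to the normalized measure on $Y$, written in terms of expectations over the whole space. A minimal numerical check, assuming a uniform probability space of `N` points and an arbitrary random choice of $Y$ and $W$, is the following.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1000
in_Y = rng.random(N) < 0.3               # indicator of the set Y
W = np.where(in_Y, rng.random(N), 0.0)   # W is supported on Y

p_Y = in_Y.mean()                        # P(Y)

# Right side of (5.16): P(Y)^{-1} E W^2 - (P(Y)^{-1} E W)^2.
var_formula = np.mean(W ** 2) / p_Y - (np.mean(W) / p_Y) ** 2

# Variance of W with respect to the uniform measure on Y.
var_direct = W[in_Y].var()
```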

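The conditional variance will be small if we have

$\mathbb{E}\Bigl(Z[\Omega_{1\to0}:\Omega_{1\not\to0}]^2\Bigm|\prod_{\omega\in\Omega_{1\not\to0}}V(x^\omega_{1,2,3})\Bigr)\stackrel{u}{=}\mathbb{E}\Bigl(Z[\Omega_{1\to0}:\Omega_{1\not\to0}]\Bigm|\prod_{\omega\in\Omega_{1\not\to0}}V(x^\omega_{1,2,3})\Bigr)^2$.

But this is a recasting of (5.12). Namely, using the notation of (5.12), we can write the equation above as

$\dfrac{L(V\mid\Omega^1)}{L(V\mid\Omega_{1\not\to0})}\stackrel{u}{=}\dfrac{L(V\mid\Omega)^2}{L(V\mid\Omega_{1\not\to0})^2}$,

which is (5.12). □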



We are interested in refinements of the Gowers Box Norms, in which we estimate $L$ in terms of a Box Norm of one of its arguments, but do so in a more efficient manner, just as in the proof of Lemma 3.13 which is presented in §7. For this Lemma, let us consider selections of $f_\omega$ where $f_\omega\in\{f,V\}$, and $f$ is a fixed function supported on $V$ and at most one in absolute value. In application, $f$ is a balanced function.

In this Lemma, we will single out the first and second coordinates for a distinguished role, which is done just for simplicity.

5.17 Lemma. Let A = 2, . . . , 6. Suppose that V is (2A, d, 4)-Uniform, where S < W{V \ 74)^'^' . 
Let $\Omega\subset\Omega_{3,\Lambda}$, where the value of $\Lambda$ is half of the uniformity assumption imposed on $V$. Let $\{f_\omega\mid\omega\in\Omega\}$ be a selection of functions which are either equal to $V$ or a fixed function $f$ which is supported on $V$ and bounded by one in absolute value. (In application, $f$ will be a balanced function.)

1. Suppose that there is an $\omega_0\in\Omega$ with $f_{\omega_0}=f$, and $\omega_0(1)\ne\omega(1)$ for all other $\omega\in\Omega$ with $f_\omega=f$. Then, we have the estimate



(5.18) 



|L(/^|Q)|<2L(y|Q) 



0{d) + 



Ex2A36S2,3ll/llnlSi''^^^ 



E 



'^2,^3652,3 



niSi 



2. Suppose that there is an $\omega_0\in\Omega$ with $f_{\omega_0}=f$, and $(\omega_0(1),\omega_0(2))\ne(\omega(1),\omega(2))$ for all other $\omega\in\Omega$ with $f_\omega=f$. Then, we have the estimate



(5.19) 



|L(/JQ)|<4L(y|Q) 



E 



0(^) + 



'^36S2,3II/IIq1,25j2 "" 



Ex3eS2,3ll^ll',:,2s^^ 



3. If there is at least one $\omega_0\in\Omega$ with $f_{\omega_0}=f$, we have



(5.20) 



|L(/^|Q)|<8L(y|n) 



E 



0{d) + 



^3652,; 



3tU ll|-|l,2,3Si2 



il/8 



lE.r3eS2,3ll^llni,2,3Si2,3 
Of course the estimate (5.20) applies in the first two cases of the Lemma. But we will be in situations, in the proof of Lemma 8.3, where we do not wish to use the estimate (5.20).

We remark that one could read the proof of Lemma 3.13 in §7 before the one below. This proof in §7 is independent of the proof below. It treats a more complicated situation, in that all the $T_j$ have to be considered, but is only discussed in a single concrete instance.

Proof. We can read off a good estimate for $L(V\mid\Omega)$ from (5.3), in all cases (1)-(3) above. For each of the three cases, we assume that the choice of $\omega_0$ specified in each of the three cases satisfies $\omega_0\equiv0$.

In case (1), we will apply the Cauchy-Schwarz inequality in all other variables. To set notation for this, let

$\Omega_{1\to0}=\{\omega\in\Omega\mid\omega(1)=0\}$, $\Omega_{1\not\to0}=\{\omega\in\Omega\mid\omega(1)\ne0\}$,

and let $X'=\{x^\ell_j\mid1\le j\le3,\ 0\le\ell\le\Lambda-1\}-\{x^0_1\}$. Then, we apply the Cauchy-Schwarz inequality to estimate

(5.21)   $|L(f_\omega\mid\Omega)|\le[L(V\mid\Omega_{1\not\to0})\cdot W_1]^{1/2}$

(5.22)   $W_1=\mathbb{E}_{X'}\prod_{\omega\in\Omega_{1\not\to0}}V(x^\omega_{1,2,3})\,\Bigl|\mathbb{E}_{x^0_1\in S_1}\prod_{\omega\in\Omega_{1\to0}}f_\omega(x^\omega_{1,2,3})\Bigr|^2$

We continue the analysis of $W_1$. It follows from the assumption in part (1) of the Lemma that $\omega_0\in\Omega_{1\to0}$, and $f_{\omega_0}=f$, but for all other choices of $\omega\in\Omega_{1\to0}$ we have $f_\omega=V$.



In order to expand the square of the expectation, using (|7.11[) , let us define a new class of 
maps as follows. For co G Qi, define 

' A ; = 1 



(5.23) 






Notice that Qji)^jo,a-i) = {tuo / <^o}r by assumption on Q that holds in this case. 
Here and below, we are expanding the set Q. We take fa, = V for all co ^ CI. 
We can write
(5.24) Wi = E,.,x'lE44eSi n ^(^lis) JJ UKi,^) 



^,\eS2,3 



where the last term is defined in (|5.13|) 



It follows from Lemma l5.14| that Z[Q|i)^|o,a-i) : ^^ - ^|ihio,a-i)] is essentially constant 
on V{xXl^^)V{xXl^^). Namely, 



E(z[n,iH(o,A-i) : n' - 0(1^(0,,-!)] I V(x^,^,3)^«;2,3) = 



L(y I Qi) 



UV,v\^w^m-i]) 



The implied k in the '=' is k = V^, see Definition 15.11 Similar comment applies to other 
uses of the the symbol '=' below. And the variance of Z[Qji|^jo,a-i| : ^^ - ^|ihio,a-i|] is 
very small. Note that L(y, V \ Qji|^jo,a-i|) = E;c2,^36S23ll^lPic / we can estimate 



(5.25) 



Wi < TL{V I Qi) 



O(V^) 



+ 



E.T2,X36S2,3ll/llnlSi 

Ex2,x3eS2,3ll^llniSi 



We combine (|5.21[) — (|5.25|) , to conclude that 



|l(/, I n)| < 2[L(y I Qi^o) • Uv I n')]'^' x 



o(V^) 



E 



^2,3;36S2,3 



+ 



E 



3^2,3:3652,3 



And so the proof of (|5.18|l will follow from the inequality 

UV I Oi^o) • L(y I Q^) < 2 l.n{V \ Of . 
This is Conservation of Densities Proposition, Proposition 15.111 



2 nl/2 



2 
niSi 



We turn to the proof of the second part, namely (|5.19|) . The initial stage of the argument 
follows the lines of the argument above. Namely, we use the estimate (|5.21|) and (|5.22|) . 
The term Wi is expanded as in (|5.24|) , with the same notation that we have in (|5.23|) . But, 
under the assumptions on $\Omega$ that hold in this case, $\Omega_{\{1\}\to\{0,\Lambda-1\}}$ need not consist of just two maps $\omega$.

We apply the Cauchy-Schwartz inequality to Wi. To do this, we make these definitions, 
recalling that Q^ is defined in (|5.23|) . 

^2^0 = {^ e n^ a){2) ^ 0} , Q^^q = {w g Q^ (o{2) = 0} , 
X" = {x^n < ^ < A} U {x^ 1 1 < ^ < A - 1} U {x^ I < ^ < A - 1} . 

Here, the point is that the only variable omitted from X" is x^. Then, we can estimate 

r 1 ii/2 

(5.26) -- ' '■' 

(5.27) 



Wi < [l{V I Qi^o) • W2] 



ojeQ, 



'2,^0 



a)6D2-,o 



To expand the square in the definition of W2, we set 

^(;) = H ^*' 

' \A ; = 2 
0^^;i = {w I w G Q^^ol , Q2 = Qi^Q U n^^o U Q^^^^ , 
O|UH|0,A-i| = {a; G n^ I a;(l),a;(2) G {0,A - 1}). 

Observe that Q|i^2H|o,a-i| = {tt'o / ^0 / tuo / / ^o}- Then, we can write 

(5.28) W2 = E,.,Y. [] /(x^"2^3)xZ[n,i,2M0,A-i) : n'-Q|i,2H,o,A-i|]. 

'^6Q|i,2H|0,A-l| 



where Y" = {x°,Xj,X2,X2,X3}, and Z[Q|i^2H|o,a-i| : ^^ - ^|i,2H|o,a-i|] is defined in (|5.13|) . 
(We assumed that cuq = 0.) 



Using Lemma |5.14[ and the the assumption of (2A, ^, 4)-uniformity on V, we can esti- 
mate 

Exf6Y"(2[^(l,2)^{0,A-l) : ^ - ^(1,2)^|0,A-1)] I II ^(^t2,3)) 

(DeQ|i^2|^|o,A-ii 

. L(y I Q^) 

L(y I Q(i^2H|o,A-i)) 
and the conditional variance of $Z[\Omega_{\{1,2\}\to\{0,\Lambda-1\}}:\Omega^2-\Omega_{\{1,2\}\to\{0,\Lambda-1\}}]$ is very small. Thus, we can estimate



(5.29) 



W2 = 2 L{V I n^) X [0( V^) 



E.o,s3l|/|| 



+ 



ni'2s 



^xOeSs 



li]. 



'1,2 



Combining (|5^2T1) , dS^D, dS^D), (IS^lTl) , and ((S^D, we see that 



|l(/,, I n)| < 2L(y I Qi^o)'/' • L(y I n^^o)'^' • Uv 1 0^)1/4 



X 



lE.o,s3ll/ir,..s,, ^^^' 



^:^06S3 



ni.2Si,2-' 



The last step in the proof of (|5.19|) is to verify that 

L{V I Qi^o)'^' • HV I "2/.o)'^' • L(^ I "2)1/4 < 2L(y I Q) . 
This is again the Conservation of Densities Proposition, Proposition lS.llI 



We turn to the third point of the Lemma, namely the inequality (|5.20[) is true. We can 
use earlier parts of the argument. Let us combine (|5.21|) , (|5.24|) , (|5.26|) , and (|5.27[) . We have 



(5.30) |L(/,, I a; G n)| < 2 L(y I D^^of^ ■ L{V \ nl^,f' ■ Y^f , 

where W2 is defined in (|5.28|) . 

The strategy is to repeat an application of the Cauchy-Schwartz inequality in all vari- 
ables except X3. To do this, we define 



n?,.„ = {w G n^ I a;(3) + 0} , n^^o = {a; G n^ I a;(3) = 0} , 



)2 

'37^0 l"^ -- ^^ \^\^) -r- ^\ , ^-^S^O 



= {xj I ;■ = 1,2, < ^ < A} U {x^ I 1 < ^ < A - 1} . 
Here, the point is that the only variable omitted from X'" is x^. Then, we can estimate 



(5.31) 
(5.32) 



W2 < [l(V I Q^^„) • W3] 



^'6n3/,0 



^e^^o 
In the product over $\Omega^2_{3\to0}$, it is important to observe that if $f_\omega=f$, it must follow that $(\omega(1),\omega(2))\in\{0,\Lambda\}^{\{1,2\}}$. For if this is not the case, an earlier step would have switched $f_\omega$ to $V$.



To expand the square, we define 



coij) = 



h(;) ;^3 



I A ;■ = 3 

Qg^A = {W I W G Q^ a){3) = 0} , Q3 = Q2 U Qg^A , 

O|lA3M0,A)={0,A}fl'2'3l. 

Then, we can write 



coed 



|1,2,3H|0,A| 



■illASHIO.AI 



L{V I Qji^2,3H|0,A|) 



Now, the term Z is nearly constant, by Lemma [5.14[ and we have 
E(^Z[Qji^2,3H(o,A| : ^ - ^|i,2,3H|o,A|] I II ^j = -j-TT 

a)6Qii 

Therefore, we can estimate 
(5.33) 






Combine (|5.30|) , (|5.31|) , (|5.32|) , and (|5.33|) to conclude that 



|L(/„ I a; G n)| < 2L(y I Qi^o)'/' • HV \ Cll f' ■ L{V \ Q^ )i 



/8 



X uv I n^)'^' 



0{^f^) 



+ 



□l'2'3Si,2.3 



□l'2-3Si,2,3 



1/8 



Therefore, it remains for us to check that 

L{V I Qi^o)'^' • UV I O^/.o)'^' • L(^ I ^l^of' ■ UV I n'f^ < 2L(V I Q) . 
This again follows from Proposition 15.111 



D 
6 Linear Forms for the Analysis of Corners

In this section, we reprise the initial portion of the previous section, though our needs are not quite as significant. For the uses of this discussion, let us make the definition

r^= n ^'''- 

l</<fc<4 

This is the same definition as for $T_\ell$, but the set $S_\ell$ is missing.

For Q c Q4^A/ where A < 3, and choices of functions Fa, & {Te , Tg \1 < { < 4}, we have 
the linear form 

AiFa,\0) = B,.^^^,s,,,,llFUxl2,3,d- 

' b<A<3 a>eO 

Here, any Sj that occurs in this expectation is composed with Aj. Our first Lemma states 
that we can easily estimate the values of these forms. 

6.1 Lemma. For Q and choices of Fa, as above we have 

e=i i<i<k<4 

O(^) = \{co I Fa, = Te}\ , W{i,k) = \{co\j,k \coeQ}\. 

In the last display we are counting the number of distinct maps there are when co is restricted to the 
sets {i,k}. 

Proof We have 

4 

n ^-(^,2,3,4) = n n ^^ ° ^^^,3,4) >< n n ^^-'^ ° ^^^1^^ 

where ^p{£) = {co \ Fa, = Tf}, and xl>ij,k) = {oj\j^k I tu s Q}. The Lemma then follows from the 
assumptions of admissibility, namely (|3.7|) and (|3.6|) , with application of (|4.5|) . n 



We need an analog of the Conservation of Densities Lemma, Proposition 5.11. Let $\Omega\subset\Omega_{4,3}$, and assume that the set $\Omega_{1\to0}$ below is not empty.

$\Omega_{1\to0}=\{\omega\in\Omega\mid\omega(1)=0,\ F_\omega\ne T_1\}$, $\Omega_{1\not\to0}=\Omega-\Omega_{1\to0}$.

Here, we exclude $T_1$, as its expectation does not include any $\delta_1$.

Consider the estimate below obtained by applying the Cauchy-Schwarz inequality in all variables except $x_1^0$.

(6.2) A(f , I Q) < [A(ni^o) ■ U2f^ 

U2 = lEll FMl2,3,d-\^4eS, n n ^-^2,3,4- 

Use (|7.11[) to write the last term as U2 = A(f ^ | Q^), where we define 

•"^icO) ; = 2,3,4 

(6.3) Q^ = Qi/^o U {w, oJ I tu G Qi^o} • 

And we define Fj^j = f^. 

6.4 Second Proposition on Conservation of Densities. IflfQc Qs^a-I/ with the notation 
in (|6.2[) — (|6.3|) we /za^e f/ze equality 

(6.5) A(f ^ I Q) = A(f ^ I Qi^o)'/' ■ A(f . I O')'^' - 

Proof. Let $\omega\in\Omega$ be such that it contributes 1 to the density $\delta_\ell$, for $2\le\ell\le4$, on the left-hand side of (6.5). Thus, $\omega\in\Omega_{1\not\to0}$, and it contributes a 1/2 to this same density in each of the two terms on the right-hand side. Let $\omega\in\Omega_{1\to0}$. Then, it contributes a 1 to the density $\delta_1$ on the left-hand side, while on the right-hand side, there is no contribution from the first term, while the second term contributes a $2\cdot1/2=1$, since there is a new variable in the first coordinate.

If one considers a density $\delta_{j,k}$ where $2\le j<k\le4$, it is accounted for much as the case of $\delta_2$ above. And a density $\delta_{1,j}$, with $j=2,3,4$, is accounted for as is $\delta_1$ above. □

This Conservation of Densities has an essentially equivalent formulation, also impor- 
tant to us, that we give here. With the notation of (|6.2[) — (|6.3|) , set 



Z[Qi^o : Oi-o] = E,o,s^ [] f.,«,2,3,4) 



a)6Qi_,o 
6.6 Lemma. For all choices of $\Omega\subset\Omega_{4,3}$ as above, we have
Var,.,^(Z[ni^o : Oi-o] I H ^-^2,34)) 

<K^/^■[B(Z[n^^o : Oi^o] I [] ^-^2,3,4))]'- 
Here, K is an absolute constant. 

Of course the conditional expectation of Z can be computed. 
Proof. We use the standard formula for the variance of a random variable W supported on 



a set y given in (|5.16|) . The conditional variance will be small if we have 



e(Z[Qi^o : Oi^ol'l n f aX4;2,3,4)) = e(Z[Oi^o : Oi^o] I ]\ F,,{xl^^,^S) . 

But this is a recasting of (|6.5|) . n 

There is a variant of the inequality (|5.20[) which holds. Let us formulate it. 

6.7 Lemma. Let Q c 04^3, and let F^^ £ {Ti, T2, T^, Tn,]. Let f^ be a choice of function satisfying 
\fa_\ < F^,. Then, we have the following inequality. Suppose, for the sake of simplicity that for 
cuo e n we have F^^ = Ti 



(6.8) |A(/^ I n)| < 2|A(f^ I Q)| X |t; + 



ll^olln2,3,4H2,3^,^l/S 



In view of the fact that we have the Second Conservation of Densities Proposition, Proposition 6.4, and the variance principle, Lemma 6.6, the proof of this inequality is just an iteration of the proof of (5.20) above, as well as the proof of Lemma 7.1 below. Accordingly we omit it.
7 Proof of the von Neumann Lemma

This is a careful application of the weighted Gowers-Cauchy-Schwarz inequality, which does not seem to follow from any standard inequality in the literature. The primary difference with the weighted inequalities of the work of Green and Tao, [8, 11], is the absence of the von Mangoldt function with its uniformity properties, a difference overcome by the enforced uniformity, an argument invented by Shkredov [18].

In our setting, the sets $X_u$ will most frequently be $H$, the copy of the finite field. The set $U$ will for the most part be $\{1,2,3,4\}$, though there are larger sets $U$, as large as 24 elements, that occur in the analysis of different terms below.

We introduce the following 4-linear form. For four functions $f_j : H\times H\times H\to\mathbb{C}$, for $1\le j\le4$, define

$Q(f_1,f_2,f_3,f_4)=\mathbb{E}_{y,\,x_j\in H,\ 1\le j\le3}\;f_4(x_1,x_2,x_3)\,f_3(x_1,x_2,x_3+y)\times f_2(x_1,x_2+y,x_3)\,f_1(x_1+y,x_2,x_3)$

If $A\subset H\times H\times H$, it follows that $Q(A,A,A,A)$ is the expected number of corners in $A$. It is an important remark that this is defined as an average over copies of $H$, whereas earlier sections have been defined over e.g. $S_{1,2,3,4}$. This fact introduces extra factors of $\delta_\ell$ below.
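
To make the normalization of $Q$ concrete, here is a brute-force sketch of $Q(A,A,A,A)$. It assumes the simplest model $H=\mathbb{Z}_5$ (that is, $n=1$), represents $A$ as a Python set of triples, and includes the degenerate value $y=0$ in the average, as the definition above does; the function name `corner_form` is illustrative.

```python
from itertools import product

def corner_form(A, N):
    """Q(A, A, A, A) over Z_N: the fraction of (x1, x2, x3, y) for which
    x, x + y*e1, x + y*e2 and x + y*e3 all lie in A (y = 0 is included)."""
    hits = 0
    for x1, x2, x3, y in product(range(N), repeat=4):
        if ((x1, x2, x3) in A
                and ((x1 + y) % N, x2, x3) in A
                and (x1, (x2 + y) % N, x3) in A
                and (x1, x2, (x3 + y) % N) in A):
            hits += 1
    return hits / N**4

N = 5
full = set(product(range(N), repeat=3))
q_full = corner_form(full, N)          # every configuration is a corner
q_point = corner_form({(0, 0, 0)}, N)  # only x = 0, y = 0 contributes
```

For the full set the form equals 1, and for a single point only the degenerate corner with $y=0$ survives, giving $N^{-4}$. A set of density $\delta$ that behaves randomly would give a value close to $\delta^4$.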

We are deliberately choosing a definition that is slightly asymmetric with respect to the subscripts on the $f_j$ on the right above, to make the next display more symmetric. Using the change of variables $y=x_4-(x_1+x_2+x_3)$, this is

$Q(f_1,f_2,f_3,f_4)=\mathbb{E}_{x_1,x_2,x_3,x_4\in H}\prod_{j=1}^{4}f_j\circ\lambda_j$,

$\lambda_j(x_1,x_2,x_3,x_4)=\sum_{k\,:\,k\ne j}x_ke_k$, $1\le j\le4$.

The point which dominates the analysis below is that the function $f_j\circ\lambda_j$ is a function of $\{x_i\mid1\le i\ne j\le4\}$, i.e., is not a function of $x_j$.

We will write, by small abuse of notation, $\lambda_1(x^\omega_{1,2,3,4})=x^\omega_{2,3,4}$. This is allowed, as $\lambda_1(x^\omega_{1,2,3,4})$ is not a function of $x_1$. This will allow us to reduce the complexity of some formulas below.
We codify the result of the application of the proof of the Gowers-Cauchy-Schwarz Inequality for the operator $Q$ into the results of the following Lemma. This technical result codifies the results that we need to understand about the set $T$, and $A$ to conclude Lemma 3.13.

In this Lemma, we single out for a distinguished role the function that falls in the last place of $Q$, but there is a corresponding estimate for all the other three functions.

7,1 Lemma. Let Tj either be identically T, or Tj = Tjfor all 1 < ; < 4. Let fj : Tj — > [-1, 1] be 
functions. We have the following estimate. 

(7.2) |Q(/i,/2,/3j4)| < U;/^-U^/^-Uf -Uf , 

(7.3) Ui = Ui(Ti) = E,,2,x3,X4eHri(X2,X3,X4) 

(7.4) U2 = U2(T2) = E,o^,o,H n ^2(4,3,4|) ' 

{7.5) U3 = U3(T3) = E ,o,H ]\ T3(x;^i^2,4|)' 

(7.6) U4 = U4(/4, Ti,T2, T3) = E,o^^^,,. ^^^,^,^^3, Z • W f,{xl,^,^) 

we|0)liA3lx|0)l«l 
3 

(7.7) z = z(Ti, T2, T3) = E,o,H n n ^^- ° '^i<<,2A4) 

a)e|0,l|li'2'31x(0|l«l 7=1 

This Lemma makes it clear that we need to understand the linear forms Ui, U2, U3, and 
Z for both the Tj and for T. 



7.8 Remark. The presence of the term $Z$ in (7.14) can be seen in the argument of [15], but it is not needed in Shkredov's approach [18]. However, this term is much more subtle in the three dimensional case. Similar terms will arise in §8, and are dealt with systematically in Lemma 5.14.

Proof. The method of proof is to follow the proof of the Gowers-Cauchy-Schwarz
inequality, especially in the case of (4.7), but keeping track of the additional information that
follows from terms that are neglected in the usual proofs of this inequality. All earlier
applications of the Gowers-Cauchy-Schwarz inequality have in some sense 'lost units of
density'. In the present argument, we recover these lost units by the mechanism of the
various functions of $T$ that appear in the definitions of $U_1$, $U_2$ and $U_3$ above.
Estimate the left-hand side of (7.2) by

(7.9) $|Q(f_1, f_2, f_3, f_4)| \le [U_1 \cdot U_{1,2}]^{1/2}$, where $U_1 = \mathbb{E}_{x_2, x_3, x_4 \in H} |f_1 \circ \lambda_1|^2 \le \mathbb{E}_{x_2, x_3, x_4 \in H}\, \mathsf{T}_1(x_2, x_3, x_4)$,

(7.10) $U_{1,2} = \mathbb{E}_{x_2, x_3, x_4 \in H}\, \mathsf{T}_1(x_{2,3,4}) \Bigl| \mathbb{E}_{x_1 \in H} \prod_{j=2}^{4} f_j \circ \lambda_j(x_{1,2,3,4}) \Bigr|^2$.

We use the Cauchy-Schwarz inequality in the variables $x_2, x_3, x_4$. The term in (7.9) proves
(7.3). In the last line, we are using the notation of the general Gowers-Cauchy-Schwarz
Inequalities, so that $x_{1,2,3,4} = (x_1, x_2, x_3, x_4)$. This will be helpful in the steps below.

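For $U_{1,2}$, we use the elementary fact that

(7.11) $\mathbb{E}_{x \in X}\, g(x) \bigl[ \mathbb{E}_{y \in Y} f(x, y) \bigr]^2 = \mathbb{E}_{x \in X,\; y^0, y^1 \in Y}\, g(x) \prod_{\epsilon \in \{0,1\}} f(x, y^{\epsilon})$.

This is in fact crucial to the proof of the Gowers-Cauchy-Schwarz inequality. In particular,
it is essential that we insert the $\mathsf{T}_1(x_{2,3,4})$ on the right in (7.10). Thus,

$U_{1,2} = \mathbb{E}_{x^0_2, x^0_3, x^0_4 \in H,\; x^0_1, x^1_1 \in H}\, \mathsf{T}_1(x^0_{2,3,4}) \prod_{\omega \in \{0,1\}^{\{1\}} \times \{0\}^{\{2,3,4\}}} \prod_{j=2}^{4} f_j \circ \lambda_j(x^{\omega}_{1,2,3,4})$.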

We refer to this identity as 'passing $x_1$ through the square.' With this notation, it is clear
that the variables $x_2$, $x_3$, $x_4$ will also need to 'pass through the square'.
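The identity (7.11) behind 'passing a variable through the square' can be checked exactly on a small example. The sketch below is only an illustration: the finite sets `X`, `Y` and the random functions `g`, `f` are assumptions made for the example and do not come from the paper. It evaluates both sides with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product
import random

def mean(values):
    """Exact average of a finite collection of Fractions."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def squared_inner_average(g, f, X, Y):
    """Left side of (7.11): E_x g(x) * (E_y f(x, y))**2."""
    return mean(g[x] * mean(f[x, y] for y in Y) ** 2 for x in X)

def doubled_variable_average(g, f, X, Y):
    """Right side of (7.11): E_{x, y0, y1} g(x) * f(x, y0) * f(x, y1)."""
    return mean(g[x] * f[x, y0] * f[x, y1]
                for x in X for y0, y1 in product(Y, repeat=2))

# Hypothetical toy data: g is an indicator (playing the role of T_1), f takes values in [-1, 1].
rng = random.Random(1)
X, Y = range(4), range(5)
g = {x: Fraction(rng.randint(0, 1)) for x in X}
f = {(x, y): Fraction(rng.randint(-3, 3), 3) for x in X for y in Y}

lhs = squared_inner_average(g, f, X, Y)
rhs = doubled_variable_average(g, f, X, Y)
```

The weight `g(x)` sits outside the square on the left and is therefore counted once, not twice, on the right. This is the mechanism by which the factor $T_1$ inserted in (7.10) survives the step.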

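Thus, we write as below, using the Cauchy-Schwarz inequality in the variables $x^0_1$, $x^1_1$, $x^0_3$,
and $x^0_4$.

(7.12) $U_{1,2} \le [U_2 \cdot U_{2,2}]^{1/2}$, where $U_2 \le \mathbb{E}_{x^0_1, x^1_1 \in H,\; x^0_3, x^0_4 \in H} \prod_{\omega \in \{0,1\}^{\{1\}} \times \{0\}^{\{3,4\}}} \mathsf{T}_2 \circ \lambda_2(x^{\omega}_{1,2,3,4})$,

(7.13) $U_{2,2} = \mathbb{E}_{x^0_1, x^1_1 \in H,\; x^0_3, x^0_4 \in H} \prod_{\omega \in \{0,1\}^{\{1\}} \times \{0\}^{\{3,4\}}} \mathsf{T}_2(x^{\omega}_{1,3,4}) \times \Bigl| \mathbb{E}_{x_2 \in H}\, \mathsf{T}_1(x_{2,3,4}) \prod_{\omega \in \{0,1\}^{\{1\}} \times \{0\}^{\{2,3,4\}}} \prod_{j=3}^{4} f_j \circ \lambda_j(x^{\omega}_{1,2,3,4}) \Bigr|^2$.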
The term in (7.12) is (7.4).

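For the term (7.13), we write

$U_{2,2} = \mathbb{E}_{x^0_{1,2}, x^1_{1,2} \in H^{\{1,2\}},\; x^0_3, x^0_4 \in H} \prod_{\omega \in \{0,1\}^{\{1,2\}} \times \{0\}^{\{3,4\}}} \mathsf{T}_2(x^{\omega}_{1,3,4})\, \mathsf{T}_1(x^{\omega}_{2,3,4}) \prod_{j=3}^{4} f_j \circ \lambda_j(x^{\omega}_{1,2,3,4})$.

We estimate using the Cauchy-Schwarz inequality in the variables $x^0_{1,2}$, $x^1_{1,2}$ and $x_4$.

$U_{2,2} \le [U_3 \cdot U_{3,2}]^{1/2}$,

$U_3 = \mathbb{E}_{x^0_{1,2}, x^1_{1,2} \in H^{\{1,2\}},\; x_4 \in H} \prod_{\omega \in \{0,1\}^{\{1,2\}} \times \{0\}^{\{4\}}} \mathsf{T}_3(x^{\omega}_{1,2,4})$,

$U_{3,2} = \mathbb{E}_{x^0_{1,2}, x^1_{1,2} \in H^{\{1,2\}},\; x_4 \in H} \Bigl| \mathbb{E}_{x_3 \in H} \prod_{\omega \in \{0,1\}^{\{1,2\}} \times \{0\}^{\{3\}}} \bigl[ \mathsf{T}_2(x^{\omega}_{1,3,4})\, \mathsf{T}_1(x^{\omega}_{2,3,4}) \times \mathsf{T}_3(x^{\omega}_{1,2,4})\, f_4 \circ \lambda_4(x^{\omega}_{1,2,3}) \bigr] \Bigr|^2$.

The term $U_3$ is (7.5).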



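We write $U_{3,2}$ as follows, after application of (7.11), and recalling the definition of $Z$ in
(7.7).

(7.14) $U_{3,2} = \mathbb{E}_{x^0_{1,2,3}, x^1_{1,2,3} \in H^{\{1,2,3\}}}\, Z \cdot \prod_{\omega \in \{0,1\}^{\{1,2,3\}} \times \{0\}^{\{4\}}} f_4 \circ \lambda_4(x^{\omega}_{1,2,3,4})$.

This completes the proof. □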



We now provide the estimates that the previous Lemma calls for, in the case of the sets $T_j$.



7.15 Lemma. For the terms $U_1$, $U_2$, $U_3$ and $Z$ as defined in (7.3)-(7.5) and (7.7), and $\mathsf{T}_j = T_j$, we
have these estimates.

(7.16) $Q(T_1, T_2, T_3, T_4) = U_1(T_1)^{1/2} \cdot U_2(T_2)^{1/4} \cdot U_3(T_3)^{1/8} \cdot U_4(T_4, T_3, T_2, T_1)^{1/8}$.

The constant $\delta$ in the definition of $=$, see Definition 5.2, can be taken to be $\delta = \mathbb{P}(T \mid H \times H \times H)^{C}$,
where $C$ is a large constant, depending only on $C_{\mathrm{admiss}}$ in Definition 3.4. And for $Z(T_1, T_2, T_3)$, we
have this inequality on conditional variance.

(7.17) $\mathrm{Var}\Bigl( Z(T_1, T_2, T_3) \Bigm| \prod_{\omega \in \{0,1\}^{\{1,2,3\}} \times \{0\}^{\{4\}}} T_4(x^{\omega}_{1,2,3}) \Bigr) \le \mathbb{P}(T \mid H \times H \times H)^{C}$.



Proof. The first claim (|7.16|) follows from (an iteration of) the Second Proposition on Con- 
servation of Densities, Proposition 16.41 The second from Lemma [6^ n 



The content of the next Lemma is that, in the case where $A \subset T$ has full probability,
$A$ has the expected number of corners.

7.18 Lemma. Let $\mathcal{T}$ be an admissible corner system. Then, we have

(7.19) $Q(T, T, T, T) = \prod_{j=1}^{4} \delta_{T|j} \times Q(T_1, T_2, T_3, T_4)$.

Here, the constant $\delta$ implicit in the $=$ can be taken to be $\delta = \kappa^{\epsilon}$, where these two constants are
determined by $\kappa_{\mathrm{admiss}}$ and $\epsilon_{\mathrm{admiss}}$ in Definition 3.4, and can be made arbitrarily small.



Proof. One considers the expression in (7.19) as a 4-linear form, and expands $T$ as $T = f_{j,1} + f_{j,0}$,
where $f_{j,1} = \delta_{T|j} T_j$. This leads to an expansion of $Q(T, T, T, T)$ into $2^4$ terms, of which the
leading term is

$Q(f_{1,1}, f_{2,1}, f_{3,1}, f_{4,1}) = \prod_{j=1}^{4} \delta_{T|j} \times Q(T_1, T_2, T_3, T_4)$.



The remaining $2^4 - 1$ terms all have at least one $f_{j,0}$. We can show that each of these terms
is at most a small constant times the expression above by appealing to (3.5) and (4.7). In
particular, we show that we can estimate

(7.20) $|Q(f_{1,\epsilon(1)}, f_{2,\epsilon(2)}, f_{3,\epsilon(3)}, f_{4,0})| \le 2\, Q(T_1, T_2, T_3, T_4) \Bigl[ \upsilon + \frac{\|f_{4,0}\|^{8}_{\Box^{1,2,3} S_{1,2,3}}}{\|T_4\|^{8}_{\Box^{1,2,3} S_{1,2,3}}} \Bigr]^{1/8}$.
By (3.5), this proves that this term is very small. This inequality singles out the fourth
coordinate for a special role, but the proof, presented in full in this case, holds in full
generality, so completes this case.

Apply Lemma 7.1 with $\mathsf{T}_j = T_j$ and $f_j = f_{j,\epsilon(j)}$ as above. The estimate we get from this
Lemma is (7.2), with the terms in (7.3)-(7.7) estimated in Lemma 7.15. The particular point
to observe is that the function $Z$ has a small conditional variance (7.17). These conditional
estimates hold on the support of the product that occurs in (7.6). Hence, we can estimate

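$|Q(f_{1,\epsilon(1)}, f_{2,\epsilon(2)}, f_{3,\epsilon(3)}, f_{4,0})| \le U_1(T_1)^{1/2} \cdot U_2(T_2)^{1/4} \cdot U_3(T_3)^{1/8} \cdot U_4(f_{4,0}, T_1, T_2, T_3)^{1/8}$

$\le U_1(T_1)^{1/2} \cdot U_2(T_2)^{1/4} \cdot U_3(T_3)^{1/8} \times \mathbb{E}\Bigl[ Z(T_1, T_2, T_3) \Bigm| \prod_{\omega \in \{0,1\}^{\{1,2,3\}} \times \{0\}^{\{4\}}} T_4(x^{\omega}_{1,2,3}) \Bigr]^{1/8} \times \|T_4\|_{\Box^{1,2,3} H_{1,2,3}} \Bigl[ \upsilon + \frac{\|f_{4,0}\|^{8}_{\Box^{1,2,3} H_{1,2,3}}}{\|T_4\|^{8}_{\Box^{1,2,3} H_{1,2,3}}} \Bigr]^{1/8}$.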

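In the last line, $\upsilon$ is a small quantity arising from the conditional variance estimate (5.15).
The key identity is (7.16). In it, observe that

$U_4(T_4, T_3, T_2, T_1) = \|T_4\|^{8}_{\Box^{1,2,3} H_{1,2,3}} \cdot \mathbb{E}\Bigl( Z(T_1, T_2, T_3) \Bigm| \prod_{\omega \in \{0,1\}^{\{1,2,3\}} \times \{0\}^{\{4\}}} T_4(x^{\omega}_{1,2,3}) \Bigr)$.

Therefore, we have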

Q(ri, T2, Ts, T,) = Ui(ri)i/2 . \J2iT2f' ■ Vsinf' 

( \^'^ 

xEZ(Ti,r2,r3)| W r4(x;^;2,3|) 

V a)6|0|llA3lx|0)l4l ' 

X JU + ||r4||nl,2.3Hi,2,3) 

And this completes the proof of (7.20) and hence the Lemma. □



To apply Lemma 7.1 to prove Lemma 3.13, we will need estimates for the terms in
(7.3)-(7.6). We turn to this next, discussing the estimates for the terms $U_j$. The estimates
for $Z(T, T, T)$ as defined in (7.7) we discuss in the next Lemma.
7.21 Lemma. We have the estimates below for the forms $U_j$ defined in (7.3)-(7.6).

$U_1(T) = \delta_{T|1}\, U_1(T_1)$,

(7.22) $U_2(T) = \delta^{2}_{T|2}\, U_2(T_2)$,

(7.23) $U_3(T) = \delta^{4}_{T|3}\, U_3(T_3)$.

The implied constant $\delta$ in the definition of $=$ can be taken to be $\mathbb{P}(T \mid H \times H \times H)$ to some large
power.



Proof. The equality (7.24) is a corollary to part 2 of Lemma 5.4 and Definition 5.2. The
other parts of the Lemma are also corollaries to the same fact, not as stated, but with
the role of $T_4$ in Definition 5.2 replaced by that of $T_2$ for (7.22), and $T_3$ for (7.23). □



We turn to the analysis of the term $Z(T, T, T)$ as defined in (7.7).
7.25 Lemma. We have the estimates below where $Z = Z(T, T, T)$.

3 
(7.26) E^o ^1 g„ (Z| m = rf 5ti,xE^o ,1 g„ (Z(ri, 72, T3) I m , 

where L^ = H H ^i^^^I^)" 
The implied constant in = can be taken as in Lemma U^TSl 

Here, note that we are using the conditional expectation notation. As the random
variable $Z$ is supported on the event $\mathsf{U} \subset H^{0}_{\{1,2,3\}} \times H^{1}_{\{1,2,3\}}$, we have

(7.27) $\mathbb{E}(Z \mid \mathsf{U}) = \frac{\mathbb{E}\, Z}{\mathbb{E}\, \mathsf{U}}$,

(7.28) $\mathrm{Var}(Z \mid \mathsf{U}) = \frac{\mathbb{E}\, Z^2}{\mathbb{E}\, \mathsf{U}} - \Bigl( \frac{\mathbb{E}\, Z}{\mathbb{E}\, \mathsf{U}} \Bigr)^{2}$,

where each expectation is over $x^0_{1,2,3}, x^1_{1,2,3} \in H_{1,2,3}$.
And the point of the Lemma is that the random variable $Z$ is nearly constant on the set $\mathsf{U}$,
and we can compute that constant.

Proof. We first calculate the denominator in (7.27) and (7.28). This is relatively simple as
the sets $R_{j,k}$ are uniform in $S_j \times S_k$, so that we can estimate

3 

(7-29) E^o ,1 eHn,.,^ = TT5; TT 5%- 

;=1 1<;<A:<3 



We now turn to the numerator in (7.27). The expectation of $Z$ in (7.27) is thought of as
a 12-linear form. Set

$\Omega_j = \{0,1\}^{\{1,2,3\} - \{j\}} \times \{0\}^{\{4\}}$, $1 \le j \le 3$.

Set $\Omega = \bigcup_{j=1}^{3} \Omega_j$. For functions $\{f_\omega \mid \omega \in \Omega\}$ define

$L(f_\omega \mid \Omega) = \mathbb{E}_{x^0_{1,2,3}, x^1_{1,2,3} \in H_{1,2,3}} \prod_{\omega \in \Omega} f_\omega$.

We are to prove the estimate

(7.30) $L(T \mid \Omega) = \prod_{j=1}^{3} \delta^{4}_{T|j} \cdot L(T_j \mid \Omega_j,\ 1 \le j \le 3)$.

Expand $T \circ \lambda_j = f_{j,1} + f_{j,0}$, where $f_{j,1} = \delta_{T|j} T_j$. The leading term is then when $f_{j,1}$ occurs in
all twelve positions. But then we have the Second Conservation of Densities Proposition
at our disposal, so that (7.30) follows from Proposition 6.4.



The ratio of (7.30) and (7.29) proves (7.26), provided the other terms arising from the
expansion of the 12-linear form are all sufficiently small. That is, we should see that for all
$2^{12} - 1$ selections of $f_{j,\epsilon(\omega)} \in \{f_{j,0}, f_{j,1}\}$ for $\omega \in \Omega_j$, $1 \le j \le 3$, with at least one $f_{j,\epsilon(\omega)} = f_{j,0}$, we
have

(7.31) $|L(f_{j,\epsilon(\omega)} \mid \Omega)| \le \kappa\, L(T \mid \Omega)$,

for a suitably small constant $\kappa$.

If we use the same line of reasoning that we have before, this would lead to a (yet)
longer multi-linear form. We therefore present the following variant of the argument
used thus far. We prove (7.31) under the following assumptions. For some $\omega \in \Omega_1$, we
have $f_{1,\epsilon(\omega)} = f_{1,0} = T - \delta_{T|1} T_1$. Moreover, this happens for $\omega = 0$, which we can assume
after a change of variables. Finally, let $J_{\mathrm{small}} = \{ j = 2, 3 \mid \delta_{T|j} < \delta_{T|1} \}$. We assume that
$f_{j,\epsilon(\omega)} = \delta_{T|j} T_j$ for all $j \in J_{\mathrm{small}}$. This can also be assumed, after a permutation of the
coordinates. We now prove the inequality



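(7.32) $|L(f_{j,\epsilon(\omega)} \mid \Omega)| \le \prod_{j \in J_{\mathrm{small}}} \delta^{4}_{T|j} \cdot L(T_j \mid \Omega_j,\ 1 \le j \le 3) \Bigl[ \upsilon + \frac{\|f_{1,0}\|^{8}_{\Box^{\{2,3,4\}}}}{\|T_1\|^{8}_{\Box^{\{2,3,4\}}}} \Bigr]^{1/8}$.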

Here, $\upsilon$ will be a very small positive constant. Our assumption (3.5), together with the
assumption about $J_{\mathrm{small}}$, permits us to conclude (7.31) from this inequality. In particular,
we can accumulate a large number of powers of $\delta_{T|1}$ from (3.5). The essential point is that
we accumulate the correct power on the densities $\delta_{T|j}$ for $j \in J_{\mathrm{small}}$, as there is no a priori
reason that the different densities $\delta_{T|j}$ need be comparable.

But (7.32) follows from application of the inequality (6.8), and so our proof of the
Lemma is complete. □
Proof of Lemma 3.13. Write $A = f_0 + f_1$ where $f_1 = \delta_{A|T} T$. We expand

$Q(A, A, A, A) = \sum_{\epsilon \in M_4} Q(f_{\epsilon(1)}, f_{\epsilon(2)}, f_{\epsilon(3)}, f_{\epsilon(4)})$.

The leading term is for the function $\epsilon \equiv 1$. It is $\delta^{4}_{A|T}\, Q(T, T, T, T)$, with the latter expression
estimated in (7.19).
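The sixteen-term expansion is pure multilinearity, which can be seen on a toy model. The sketch below is an illustration under stated assumptions, not the paper's form: it takes $H = \mathbb{Z}_3$, replaces each $f_j \circ \lambda_j$ by a function of the three variables other than $x_j$, and uses hypothetical sets `T` and `A` chosen only so that $A \subset T$.

```python
from fractions import Fraction
from itertools import product

N = 3            # toy group H = Z_3 (assumption for illustration)
H = range(N)

def Q(f1, f2, f3, f4):
    """Toy 4-linear form: the j-th function does not see the variable x_j."""
    total = Fraction(0)
    for x1, x2, x3, x4 in product(H, repeat=4):
        total += f1[x2, x3, x4] * f2[x1, x3, x4] * f3[x1, x2, x4] * f4[x1, x2, x3]
    return total / N ** 4

def indicator(points):
    """Indicator function of a subset of H^3, stored as a dictionary."""
    return {p: Fraction(int(p in points)) for p in product(H, repeat=3)}

# Hypothetical sets with A contained in T.
T = indicator({p for p in product(H, repeat=3) if sum(p) % N != 0})
A = indicator({p for p in product(H, repeat=3) if sum(p) % N == 1})

density = sum(A.values()) / sum(T.values())   # delta_{A|T}
f1 = {p: density * T[p] for p in T}           # f_1 = delta_{A|T} T
f0 = {p: A[p] - f1[p] for p in T}             # f_0 = A - f_1
parts = (f0, f1)

# Sum over all sixteen maps eps : {1,2,3,4} -> {0,1}.
expansion = sum(Q(*(parts[e] for e in eps)) for eps in product((0, 1), repeat=4))
leading = Q(f1, f1, f1, f1)
```

Here `expansion` reproduces `Q(A, A, A, A)` and `leading` equals `density**4 * Q(T, T, T, T)`. The substance of the proof is the bound on the other fifteen terms, which this toy computation does not address.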



All other choices of $\epsilon$ have at least one choice of $1 \le j \le 4$ for which we have
$\epsilon(j) = 0$. We claim that for all of these we have the estimate

(7.33) $|Q(f_{\epsilon(1)}, f_{\epsilon(2)}, f_{\epsilon(3)}, f_{\epsilon(4)})| \le \kappa\, \delta^{4}_{A|T}\, Q(T, T, T, T)$.

This depends upon the assumption (3.15). For $\kappa < 2^{-5}$, this will show that $Q(A, A, A, A) \ge
\tfrac12 \delta^{4}_{A|T}\, Q(T, T, T, T)$. From this, we conclude that the number of corners in $A$ is at least

$Q(A, A, A, A)|H|^4 - |A| \ge \tfrac12 \delta^{4}_{A|T}\, Q(T, T, T, T)|H|^4 - |A| > 0$.

Here, we subtract off $|A|$, as the average $Q(A, A, A, A)$ includes the 'trivial corners' where all
four points in the corner are the same. The inequality holds by (3.14), and this completes
the proof.
We prove (7.33) for $\epsilon(4) = 0$, with the other cases following by symmetry. Apply
Lemma 7.1 with $\mathsf{T}_j = T$, and $f_4 = f_0$. This gives us the inequality



$|Q(f_{\epsilon(1)}, f_{\epsilon(2)}, f_{\epsilon(3)}, f_0)| \le U_1(T)^{1/2} \cdot U_2(T)^{1/4} \cdot U_3(T)^{1/8} \cdot U_4(f_0, T, T, T)^{1/8}$.



The terms $U_j(T)$ for $j = 1, 2, 3$ are estimated in Lemma 7.21. The definition of $U_4(f_0, T, T, T)$
in (7.6) depends upon $Z$, which has its properties listed in Lemma 7.25. This leads us to
the estimate

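$|Q(f_{\epsilon(1)}, f_{\epsilon(2)}, f_{\epsilon(3)}, f_0)| \le U_1(T)^{1/2} \cdot U_2(T)^{1/4} \cdot U_3(T)^{1/8} \cdot \mathbb{E}(Z \mid \mathsf{U})^{1/8} \times \|T\|_{\Box^{\{1,2,3\}}} \Bigl[ \upsilon + \frac{\|f_0\|^{8}_{\Box^{\{1,2,3\}}}}{\|T\|^{8}_{\Box^{\{1,2,3\}}}} \Bigr]^{1/8}$

$\le \prod_{j=1}^{4} \delta_{T|j} \times U_1(T_1)^{1/2} \cdot U_2(T_2)^{1/4} \cdot U_3(T_3)^{1/8} \times \mathbb{E}(Z(T_1, T_2, T_3) \mid \mathsf{U})^{1/8} \times \|T_4\|_{\Box^{\{1,2,3\}}} \cdot \Bigl[ \upsilon + \frac{\|f_0\|^{8}_{\Box^{\{1,2,3\}}}}{\|T\|^{8}_{\Box^{\{1,2,3\}}}} \Bigr]^{1/8}$

$\le Q(T, T, T, T) \Bigl[ \upsilon + \frac{\|f_0\|^{8}_{\Box^{\{1,2,3\}}}}{\|T\|^{8}_{\Box^{\{1,2,3\}}}} \Bigr]^{1/8}$.

Our proof is complete. □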



8 The Paley-Zygmund Inequality for the Box Norm and the
set T

Let us recall the following classical result. 

8.1 The Paley-Zygmund Inequality. There is a $0 < c < 1$ so that for all random variables
$-1 \le Z \le 1$ with $\mathbb{E} Z = 0$ we have $\mathbb{P}(Z \ge c\, \mathbb{E} Z^2) \ge c\, \mathbb{E} Z^2$.
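As a sanity check, the inequality can be tested exactly on finite random variables. The sketch below assumes the value $c = 1/4$, which is not taken from the paper. It comes from the standard argument $\mathbb{E} Z^{+} = \tfrac12 \mathbb{E}|Z| \ge \tfrac12 \mathbb{E} Z^2$ combined with $\mathbb{E} Z^{+} \le t + \mathbb{P}(Z > t)$ for $t \ge 0$. The helper names are hypothetical, and the check uses the slightly stronger event $\{Z > c\,\mathbb{E}Z^2\}$.

```python
from fractions import Fraction
import random

def paley_zygmund_holds(values, probs, c=Fraction(1, 4)):
    """Check P(Z > c*E[Z^2]) >= c*E[Z^2] for a finite, mean-zero Z with |Z| <= 1."""
    assert sum(probs) == 1
    assert all(abs(v) <= 1 for v in values)
    assert sum(v * p for v, p in zip(values, probs)) == 0
    second_moment = sum(v * v * p for v, p in zip(values, probs))
    threshold = c * second_moment
    tail = sum(p for v, p in zip(values, probs) if v > threshold)
    return tail >= threshold

def random_mean_zero_variable(rng, size=6):
    """Random finite distribution, recentred so that the mean is exactly zero."""
    weights = [rng.randint(1, 20) for _ in range(size)]
    total = sum(weights)
    probs = [Fraction(w, total) for w in weights]
    raw = [Fraction(rng.randint(-50, 50), 100) for _ in range(size)]
    mean = sum(v * p for v, p in zip(raw, probs))
    # |raw| <= 1/2 and |mean| <= 1/2, so the recentred values stay in [-1, 1].
    values = [v - mean for v in raw]
    return values, probs

rng = random.Random(0)
all_hold = all(paley_zygmund_holds(*random_mean_zero_variable(rng)) for _ in range(200))
```

Exact rational arithmetic avoids any floating point ambiguity at the threshold. The symmetric sign variable $Z = \pm 1$ shows that the probability bound cannot be improved beyond a constant multiple of $\mathbb{E} Z^2$ when $\mathbb{E} Z^2$ is of order one.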

Our central purpose in this section is to provide extensions of this result to the case
where the assumption on the standard deviation of the random variable is replaced by
an assumption on the Box Norm. Extensions are provided in two different settings, an
'unweighted' and a 'weighted' one. Indeed, in the unweighted case, we will only require
the two dimensional version of this inequality.

8.2 The Paley-Zygmund Inequality for the Box Norm. There is a constant $c(2)$, and $t(2) > 1$
so that the following holds. For all finite sets $X_t$, $1 \le t \le 2$, and subsets $A \subset X_{1,2}$, set $\delta = \mathbb{P}(A)$
and $\sigma = \|A - \mathbb{P}(A)\|_{\Box^{1,2} X_{1,2}}$. There are subsets

$X'_t \subset X_t$, $t = 1, 2$,

$\mathbb{P}(X'_t) \ge c(2)(\sigma\delta)^{t(2)}$,
$\mathbb{P}(A \mid X'_{1,2}) \ge \delta + c(2)(\delta\sigma)^{t(2)}$.

We refer the reader to E Proposition 5.7] or IITSl Lemma 3.4] for a proof of this Lemma. 
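To make the quantity $\sigma$ concrete, the following sketch computes the fourth power of the two-dimensional box norm of the balanced function $A - \mathbb{P}(A)$. It assumes the standard definition of that norm as an average over rectangles; the function names and the two example sets are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def box_norm_fourth_power(f, X1, X2):
    """||f||^4 in the two-dimensional box norm: the average of
    f(x,y) f(x',y) f(x,y') f(x',y') over all x, x' in X1 and y, y' in X2."""
    total = Fraction(0)
    for x, xp in product(X1, repeat=2):
        for y, yp in product(X2, repeat=2):
            total += f[x, y] * f[xp, y] * f[x, yp] * f[xp, yp]
    return total / (len(X1) ** 2 * len(X2) ** 2)

def balanced_function(A, X1, X2):
    """A - P(A), where A is a set of pairs and P(A) is its density in X1 x X2."""
    density = Fraction(len(A), len(X1) * len(X2))
    return {(x, y): Fraction(int((x, y) in A)) - density for x in X1 for y in X2}

X1 = X2 = range(4)
product_set = {(x, y) for x in X1 for y in X2 if x < 2 and y < 2}
checkerboard = {(x, y) for x in X1 for y in X2 if (x + y) % 2 == 0}
whole_space = {(x, y) for x in X1 for y in X2}

sigma4_product = box_norm_fourth_power(balanced_function(product_set, X1, X2), X1, X2)
sigma4_checker = box_norm_fourth_power(balanced_function(checkerboard, X1, X2), X1, X2)
sigma4_whole = box_norm_fourth_power(balanced_function(whole_space, X1, X2), X1, X2)
```

Both structured sets have a large box norm relative to their density, which is the situation in which the Lemma produces a density increment on a product of subsets. The whole space has balanced function zero and hence $\sigma = 0$.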

We need a more general version of the Paley-Zygmund Inequality for the Box Norm, one
based upon the properties of the sets $A \subset T \subset T_j$. We need two Lemmas, with very similar
proofs; accordingly we state one Lemma. Our Lemmas should be coordinate-free, but to
ease the burden of notation, we state them distinguishing the coordinate $x_4$ for a special
role.

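8.3 Lemma. There are constants $c > 0$ and $C, p > 1$ so that the following holds. Suppose that $\mathcal{T}$ is
a T-system as in (3.3), which satisfies (3.7) and (3.6). Let $U \subset V \subset T_4$. Assume that $V \in \{T_4, T\}$,

(8.4) $\|U - \mathbb{P}(U \mid V)V\|_{\Box^{1,2,3} S_{1,2,3}} \ge \tau \|V\|_{\Box^{1,2,3} S_{1,2,3}}$,

and that $V$ is $(4, \vartheta, 4)$-uniform (recall Definition 5.2), where

(8.5) $\vartheta = (\tau\, \mathbb{P}(U \mid V))^{C}$.

Then, there is a T-system

(8.6) $\mathcal{T}' = \{H, S'_k, R'_{k,\ell}, T' \mid 1 \le k, \ell \le 4,\ k < \ell\}$

and a set $V' \subset T'_4$, which satisfy

(8.7) $V' = T'_4$ if $V = T_4$, and $V' \subset V$ if $V = T$;

(8.8) $\mathbb{P}(T'_4 \mid T_4) \ge (\tau\, \mathbb{P}(U \mid T_4))^{p}$ if $V = T_4$, and $\mathbb{P}(T' \mid T) \ge (\tau\, \mathbb{P}(U \mid T))^{p}$ if $V = T$;

(8.9) $\mathbb{P}(U \mid T' \cap V') \ge \mathbb{P}(U \mid V) + c(\tau \cdot \mathbb{P}(U \mid V))^{p}$.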
The point of these estimates is that we have a little information about the new data,
in (8.7). There are some lower bounds on the probabilities of the elements of the new
T-system given by the estimate (8.8). And in (8.9), we have that $U$ has a slightly larger
probability in $T' \cap V'$. Note that we certainly do not assume that the new T-system $\mathcal{T}'$
satisfies the uniformity assumptions in the definition of admissibility, Definition 3.4.

Proof of Lemma 3.16. To prove Lemma 3.16, apply Lemma 8.3 with $V = T$, $U = A$, and
$\tau = \kappa\, \delta_{A|T}$, where $\kappa$ is as in (3.15). The conclusions of Lemma 8.3 then imply those of
Lemma 3.16. □



8.1 One-Dimensional Obstructions 

We carry out the proof of Lemma 8.3. Throughout, we use the expansion $U = f_1 + f_0$ where
$f_1 = \delta_{U|V} V$ and $\delta_{U|V} = \mathbb{P}(U \mid V)$. We will also use the notation $\delta_{V|4} = \mathbb{P}(V \mid T_4)$. The key
assumption is (8.4), which could hold due to lower-dimensional obstructions, and so there
are two initial stages in which we address these obstructions.

We begin by considering the possibility that (|8.4[) holds for some one-dimensional 
reason. Namely, let us assume that, for instance, we have 

> liciiduivT^f ■ 5l ■ 6l^,- 6l, ■ 6l, ■ 62,3. 

Note that the last expectation is estimated by virtue of our assumption on $(4, \vartheta, 4)$-uniformity,
recall (5.3). Here, $c_1 > 0$ and $t_1 > 1$ are constants that we will specify below,
based upon considerations in the next two stages of our argument.



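Let us rephrase (8.10) as

(8.11) $\mathbb{E}_{x_{2,3} \in R_{2,3}} |\mathbb{E}_{x_1 \in S_1} f_0(x_1, x_2, x_3)| \ge \tfrac12 c_1 (\delta_{U|V}\tau)^{t_1} \cdot \delta_4 \cdot \delta_{V|4} \cdot \delta_{1,2} \cdot \delta_{1,3}$,

where we have replaced the expectation over $S_{2,3} = S_2 \times S_3$ by expectation over the smaller
set $R_{2,3}$. Of course, we have $|\mathbb{E}_{x_1 \in S_1} f_0(x_1, x_2, x_3)| \le \mathbb{E}_{x_1 \in S_1} V(x_1, x_2, x_3)$. But this
last random variable is nearly constant over $R_{2,3}$, in the sense that its variance is small. Namely,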

2 

(8.12) Var;,236R23(Exi6Si^(Xl,X2,X3)) <i<^T^fE xieSi ^(Xi,X2,X3)] . 

^2,36R2,3 
This is a corollary to Lemma 5.14.

We are in a situation where we can apply the Paley-Zygmund inequality, Proposition 8.1.
Note that the random variable $\mathbb{E}_{x_1 \in S_1} f_0(x_1, x_2, x_3)$ is dominated in absolute value by
$\mathbb{E}_{x_1 \in S_1} V(x_1, x_2, x_3)$, which has average value (on $R_{2,3}$) given by

(8.13) $\mathbb{E}_{x_1 \in S_1,\; x_{2,3} \in R_{2,3}} V(x_1, x_2, x_3) = \delta_{V|4} \cdot \delta_{1,2} \cdot \delta_{1,3} \cdot \delta_4$.

This follows from assumption and (5.3). Moreover, by (8.12), the random variable
$\mathbb{E}_{x_1 \in S_1} V(x_1, x_2, x_3)$ has very small variance on $R_{2,3}$, so that except for a negligible
probability, it is dominated by, say, twice its expectation. The key point here is that in applying
the Paley-Zygmund inequality, we can use the normalized variance given by the ratio of
(8.11) and (8.13):



Ex2,36S2.3|ExieSi/o(Xl,X2,X3)| \ci{bu W^Y' ^l^l ^ ^b\2^\ 



^2,36-^2,3 

= |ci((5u|yT)'^ 

Thus, we can estimate 

^23 ={^2,3 e R2,3 I 'E^,eSifo{Xl,X2,X^) > 5^Ci(6m yT)^iE xjeSi '^(^^^l, ^2, X3)} / 

^2,36^2,3 

(8.14) P(i^2,3l^2,3)>^Ci(5myT)^ 

— ' 

We conclude the Lemma by taking the set $R'_{2,3}$ in (8.6) as above, $T' = T \cap R'_{2,3}$, and the
other data is unchanged. If $V = T_4$, the new set $V' = V \cdot R'_{2,3}$, so that (8.7) holds. That (8.8)
holds follows from (8.14), and several applications of (4.7). And that (8.9) holds follows
from construction of $R'_{2,3}$.



8.2 Two-Dimensional Obstructions 



We continue the proof assuming that (8.10) fails as written, and also fails under any
permutation of the variables $x_1$, $x_2$, and $x_3$. The potential lower dimensional obstructions
are now two-dimensional in nature. We could have for instance

(8.15) $\mathbb{E}_{x_1 \in S_1} \|f_0\|^{4}_{\Box^{2,3} S_{2,3}} \ge c_2 (\delta_{U|V}\tau)^{t_2}\, \mathbb{E}_{x_1 \in S_1} \|V\|^{4}_{\Box^{2,3} S_{2,3}}$.
Here, $t_2, c_2 > 0$ are constants that are to be specified, based upon considerations in the next
stage of the argument. The last expectation can be computed exactly, and is



^^resMWU^s,, 



(8.16) 



= [645y|4f n ^M- 
l</</c<3 



Of course we have $\|f_0\|_{\Box^{2,3} S_{2,3}} \le \|V\|_{\Box^{2,3} S_{2,3}}$. Still, the deduction of the Lemma in this case
doesn't follow from a straightforward application of Lemma 8.2 in two dimensions, as
we are in the weighted case. This argument is the one that relates the constants $c_1$, $t_1$ and
constants $c_2$, $t_2$.



Following notation used in the proof of Lemma lS^ we define a four linear term which 
arises from (|8.15|) . 



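(8.17) $B_4(f_{0,0}, f_{0,1}, f_{1,0}, f_{1,1}) = \mathbb{E}_{x_1 \in S_1,\; x^0_{2,3}, x^1_{2,3} \in S_{2,3}} \prod_{\epsilon \in \{0,1\}^2} f_\epsilon(x_1, x^{\epsilon}_{2,3})$.

Note that the left-hand-side of (8.15) is $B_4(f_0, f_0, f_0, f_0)$, and that $\mathbb{E}_{x_1 \in S_1} \|V\|^{4}_{\Box^{2,3} S_{2,3}} = B_4(V, V, V, V)$,
which is given in (8.16).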



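Our central claims are these inequalities, which hold for $c_1$, $t_1$ sufficiently large, in terms
of $c_2$, $t_2$.

(8.18) $\frac{B_4(U, U, U, U)}{B_4(V, V, V, V)} \ge \delta^{4}_{U|V} + \tfrac12 c_2 (\delta_{U|V}\tau)^{t_2}$,

(8.19) $\Bigl| \frac{B_4(U, U, U, V)}{B_4(V, V, V, V)} - \delta^{3}_{U|V} \Bigr| \le 8 c_1 (\delta_{U|V}\tau)^{t_1}$,

(8.20) $Z_V := \mathbb{E}_{x_1 \in S_1,\; x^1_{2,3} \in S_{2,3}} V(x_1, x^0_2, x^0_3)\, V(x_1, x^1_2, x^0_3)\, V(x_1, x^0_2, x^1_3)\, V(x_1, x^1_2, x^1_3)$,

(8.21) $\mathrm{Var}_{x^0_{2,3} \in S_{2,3}}(Z_V) \le \sqrt{\vartheta} \cdot B_4(V, V, V, V)^2$,

(8.22) $Z_U := \mathbb{E}_{x_1 \in S_1,\; x^1_{2,3} \in S_{2,3}} U(x_1, x^0_2, x^0_3)\, U(x_1, x^1_2, x^0_3)\, U(x_1, x^0_2, x^1_3)\, V(x_1, x^1_2, x^1_3)$, with $\mathbb{E}_{x^0_{2,3} \in S_{2,3}}(Z_U) = B_4(U, U, U, V)$,

(8.23) $\mathrm{Var}_{x^0_{2,3} \in S_{2,3}}(Z_U) \le 31 c_1 (\delta_{U|V}\tau)^{t_1} B_4(V, V, V, V)^2$.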

Notice that the constant $t_1$ of (8.10) appears in the estimates (8.19) and (8.23). We take
$t_1 > 2t_2 + 3$. In (8.23), note that we have three occurrences of $U$ and one of $V$. The
expectation of $Z_U$ is the term in (8.19).



Proof of (8.18). The denominator on the left-hand-side is estimated in (8.16). So we estimate
the numerator. We use the expansion $U = f_1 + f_0$ four times to write $B_4(U, U, U, U)$ as a sum
of sixteen terms.

$B_4(U, U, U, U) = \sum_{\epsilon \in M_4} B_4(f_{\epsilon(0,0)}, f_{\epsilon(0,1)}, f_{\epsilon(1,0)}, f_{\epsilon(1,1)})$

where $M_4$ denotes the collection of sixteen maps from $\{0,1\}^2$ into $\{0,1\}$. The two significant
terms are associated to the maps $\epsilon \equiv 0$ and $\epsilon \equiv 1$.

$B_4(f_1, f_1, f_1, f_1) = \delta^{4}_{U|V}\, B_4(V, V, V, V)$,
$B_4(f_0, f_0, f_0, f_0) \ge c_2 (\delta_{U|V}\tau)^{t_2}\, B_4(V, V, V, V)$.

The first is by definition of $f_1 = \delta_{U|V} V$, while the second is by assumption (8.15). We should
argue that the sum of the remaining fourteen choices of $\epsilon$ is small. But this follows from
the fact that (8.11) fails, and the inequality (5.18). For any choice of $\epsilon \not\equiv 0, 1$, the central
hypothesis leading to that inequality holds. Of course, it is important to use the fact that
the one-dimensional obstructions are not in place at this point.
Proof of (8.19). In $B_4(U, U, U, V)$, expand each $U$ as $f_1 + f_0$. The leading term is when each
$U$ is replaced by $f_1$, giving us

$B_4(f_1, f_1, f_1, V) = \delta^{3}_{U|V}\, B_4(V, V, V, V)$.

The remaining seven terms are of the form $B_4(f_{\epsilon(0,0)}, f_{\epsilon(0,1)}, f_{\epsilon(1,0)}, V)$, where $\epsilon \not\equiv 1$. But then,
the estimate (5.18) applies, so this proof is finished. □



Proof of (8.20) and (8.21). The equation (8.20) is by definition, and (8.21) is a consequence
of assumption on $V$ and Lemma 5.14. □
Proof of (8.22) and (8.23). The equation (8.22) is by definition of $Z_U$. The inequality (8.23) is
very similar in spirit to Lemma 5.14, but does not explicitly follow from that Lemma.

To compute the variance of Zu, we need the following 8-linear form. 

Uigl, gl, g3,g4, g5, gb, g7, gs) 

dSV^l' 2' 3/d6\-^i/ ^2' 3)g7\^\' -^2' -^3/68v-^i/ -^2' ?>' 

The point of this definition is that $\mathbb{E}_{x^0_{2,3} \in S_{2,3}} Z_U^2 = L_8(U, U, U, V, U, U, U, V)$, and we want to
establish the estimate (8.23).

We already have (8.19), which gives us an estimate of $\mathbb{E}_{x^0_{2,3} \in S_{2,3}} Z_U$. It follows from $V$
being $(4, \vartheta, 4)$-uniform that we have

$\delta^{6}_{U|V}\, L_8(V, V, V, V, V, V, V, V) = \delta^{6}_{U|V} \cdot B_4(V, V, V, V)^2$.

And so, we should verify that

(8.24) $|L_8(U, U, U, V, U, U, U, V) - \delta^{6}_{U|V}\, L_8(V, V, V, V, V, V, V, V)| \le 20 c_1 (\delta_{U|V}\tau)^{t_1} L_8(V, V, V, V, V, V, V, V)$.



The key assumption is that (8.10) fails, which in turn suggests that we appeal to the
inequality (5.18). But, in the definition of $L_8$, no single variable occurs in just one function,
the key hypothesis needed to apply (5.18). This fact brings us to the observation that,
for instance, in the definition of $L_8$, only $g_7$ and $g_8$ are functions of a certain one of the variables. Moreover, we are
interested in the case where $g_8 = V$, a 'highly uniform' function, and $g_7 = U = f_1 + f_0$.
Thus, our strategy is to selectively replace occurrences of $U$ in $L_8(U, U, U, V, U, U, U, V)$ in
such a way that at each stage, there is a single occurrence of $f_0$, and that there is a variable
in $f_0$ which only occurs in instances of $V$.

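Specifically, we write

$L_8(U, U, U, V, U, U, U, V) - \delta^{6}_{U|V}\, L_8(V, V, V, V, V, V, V, V) = \sum_{m=1}^{6} D_m$,

$D_1 = L_8(U, U, U, V, U, U, f_0, V)$, $D_2 = \delta_{U|V}\, L_8(U, U, U, V, U, f_0, V, V)$,

$D_3 = \delta^{2}_{U|V}\, L_8(U, U, U, V, f_0, V, V, V)$, $D_4 = \delta^{3}_{U|V}\, L_8(U, U, f_0, V, V, V, V, V)$,

$D_5 = \delta^{4}_{U|V}\, L_8(U, f_0, V, V, V, V, V, V)$, $D_6 = \delta^{5}_{U|V}\, L_8(f_0, V, V, V, V, V, V, V)$.

Then, (8.24) will follow from the estimate

(8.25) $|D_m| \le 3 c_1 (\delta_{U|V}\tau)^{t_1} L_8(V, V, V, V, V, V, V, V)$, $1 \le m \le 6$.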

Each of the six inequalities in (8.25) follows from the same principle, and so we will only
explicitly discuss the estimate for $D_1$. Write

Di = E^o^^ .^i^^gg^^^ u(x-^, X2, x^)Ll{x.^^, X2, Xj)Ll{x-^, x^, x^) V(x-^, x^, X3) 

/\ L-1 \>^i / -^9/ -^"5/ ^-^ \"^1 /-^oz-^Q/ * -^^ y^pSt / \ 1 / "^9/ %} \ 1 / "9/ "'^/ * 

Apply the Cauchy-Schwarz inequality in all variables except $x_2 \in S_2$. In so doing, apply
the First Proposition on Conservation of Densities, Proposition 5.11, and the assumption
of $V$ being $(4, \vartheta, 4)$-uniform to conclude that

(8.26) IDil < U{y, V, V, V, V, V, Vr V)[ ^ + ^Jv^v^vJ) ] 



1 ^3 ^0\ 



4 ^2 ^2\ 



4 ^3 ^2\ 



^4(5'!/ S"2/ S'3/ ^"4) -E xJeSi S"l('*^l''*^2''*^3)S"2(^l/^2''*^3)S"3(^l/^2'^3)S"l('*^l''*^2'^3) 



x^,4eS3 



In the right-hand-side of (|8.26|) , observe that we can write 



Uifojo, V,V) = 1E ,i^s, Mx{,xi,xl)Mx[,xi,xl) ■ Y 

x?,x?eS2 



xOeSs 



1 ^2 ^2\ 



J ^3 ^2\ 



J- — J- \yi -1^ -^9/ "^97 — -^^ y^ F S T \ 1/"^9/'^/ \ 1/9/"^'^7 / • 



It follows from Lemma 15.141 and assumption on V, that Y is a random variable with 
non-zero mean and very small variance on the event V{x\, x^, x^) V(xJ, ^2, x^). Hence, 

Uifojo, V,V) ^ r^ UifoJoAA) 

U{V,V,V,V) - ^ L4(V,F,1,1) 

But the last ratio is controlled by the failure of (8.10), so our proof of (8.25), and hence
(8.23), is complete. □
We need to conclude the proof of the Lemma, assuming the inequalities (8.18)-(8.23).
Select a point $x^0_{2,3} \in S_{2,3}$ at random, and define the data in (8.6) as follows.

S[{X2^^) = {Xi I {Xi,X2,X^) G U} , 
^1,3("*^2,3) ~ {(^1/^3) I (^1/^2''*^3)'(''^1''*^2''*^3) ^ ^! ' 

With this definition, it is clear that (8.7) holds, namely if $V = T_4$, we have $V' = T'_4 = T'(x^0_{2,3})$.
No change is made to the data not listed here, namely $S_2$, $S_3$ and $R_{2,3}$. The point of these
definitions is that we have

E ,,es, r{xl,) = B,{U,U,U,V), 



and P ,.j6Si (^'(^3)) = Zu(x^3) = Zu, in the notation of (ISlIIb and (H^S- 

^2,3^^2,3 

Define the event 

S2,3 = {x° 3 G S2,3 I \Zu - MU, U, U, V)\ < [C2{6u I vt)V'^^ B4(y, V, V, V) 

\Zv - B4( V, V, V, V)\ < [C2(6u , vt)\'''^ B4(F, V, V, V)] . 

It follows from (8.20)-(8.23) that we have

P(S2,3-S2,3)<32[C2(6my0r^^'- 

Moreover, for $t_1 > 4t_2$, notice that we would have inequalities that look quite similar to
(8.18) and (8.19). In particular, we will have



\^xi^es,,^u - MU, u, u,v)\< {C2{du I vt)Y'i^ Mv, y, y, y) , 

with a similar inequality for $Z_V$. Hence, we can conclude the proof of the Lemma, by
noting that

ZJL ^"jr Zij 
U ^ -^2,3^52,3 " ^ 

■^2,3^^2,3 -^2,3^^2,3 
8.3 Three-Dimensional Obstructions

We proceed under the assumption that both (8.10) and (8.15) fail, as written and under
all permutations of coordinates. We have specified $c_1$, $t_1$ as functions of $c_2$, $t_2$, and this
argument will specify these last two constants.



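We need the 8-linear form, the analog of (8.17), given by

$B_8(f_\epsilon \mid \epsilon \in \{0,1\}^{\{1,2,3\}}) = \mathbb{E}_{x^0_{1,2,3}, x^1_{1,2,3} \in S_{1,2,3}} \prod_{\epsilon \in \{0,1\}^{\{1,2,3\}}} f_\epsilon(x^{\epsilon}_{1,2,3})$.

The relevant facts we need about this form concern these values. Set

$B_8[W] = B_8(W \mid \epsilon \in \{0,1\}^{\{1,2,3\}})$, $W = U, V$,
$B_8[U, V] = B_8(U, \dots, U, V \mid \epsilon \in \{0,1\}^{\{1,2,3\}})$,

where the lone $V$ occurs in the $\{1\}^{\{1,2,3\}}$ position. Indeed, note that $B_8[U] = \|U\|^{8}_{\Box^{1,2,3} S_{1,2,3}}$.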



The facts we need are these. 



(8.27) 
(8.28) 



MU] 



^'^1^ Bs[V] 



>6^ + 1t^ 



Bs^-^l^ 2 



<Ubn\vTf\ 



(8.29) 
(8.30) 



e6jO,l)liA3| 

E(ziu)=Mm, 

Var,o^^,s^^3(Z I U) < ^.{buiYTf^'BAVf . 



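Proof of (8.27). Consider $B_8[U]$. Expand each occurrence of $U$ as $f_1 + f_0$, where $f_1 = \delta_{U|V} V$.
This leads to

(8.31) $B_8[U] = \sum_{\rho \in M_8} B_8(f_{\rho(\epsilon)} \mid \epsilon \in \{0,1\}^{\{1,2,3\}})$

where $M_8$ is the class of maps from $\{0,1\}^{\{1,2,3\}}$ into $\{0,1\}$. The leading term is $\rho \equiv 1$, which is

(8.32) $B_8(f_1 \mid \epsilon \in \{0,1\}^{\{1,2,3\}}) = \delta^{8}_{U|V} \|V\|^{8}_{\Box^{1,2,3} S_{1,2,3}}$.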
The other significant term is $\rho \equiv 0$, which is

$B_8(f_0 \mid \epsilon \in \{0,1\}^{\{1,2,3\}}) = \|f_0\|^{8}_{\Box^{1,2,3} S_{1,2,3}} \ge \tau^{8} \|V\|^{8}_{\Box^{1,2,3} S_{1,2,3}}$.

The last inequality follows from (8.4).

That leaves $2^8 - 2$ additional terms in $M_8$ to consider. For each $\rho \in M_8$ which is not
identically 0 or 1, the assumption for the inequality (5.19) holds. Namely, there is a
choice of $\epsilon \in \{0,1\}^{\{1,2,3\}}$, and choice of distinct $j, k \in \{1,2,3\}$ so that $\rho(\epsilon) = 0$, and for every
other $\epsilon'$, we have either $\epsilon(j) \ne \epsilon'(j)$ or $\epsilon(k) \ne \epsilon'(k)$. Therefore, the inequality (5.19) holds.

Combining this inequality with our assumption that (8.15) fails, we see that this holds.

(8.33) $|B_8(f_{\rho(\epsilon)} \mid \epsilon \in \{0,1\}^{\{1,2,3\}})| \le c_2 (\delta_{U|V}\tau)^{t_2} \times \|V\|^{8}_{\Box^{1,2,3} S_{1,2,3}}$.

For $c_2$ sufficiently small, and $t_2 \ge 8$, this completes the proof of (8.27). □

Proof of (8.28). Keeping the notation of (8.31), we have

$B_8[U, V] = \delta^{-1}_{U|V} \sum_{\rho \in M'_8} B_8(f_{\rho(\epsilon)} \mid \epsilon \in \{0,1\}^{\{1,2,3\}})$

where $M'_8$ is the class of maps $\rho \in M_8$ such that $\rho(1^{\{1,2,3\}}) = 1$. The leading term is again
$\rho \equiv 1$, which is (8.32) above. The remaining $2^7 - 1$ terms all admit the bound (8.33).
Therefore,

$\bigl| B_8[U, V] - \delta^{-1}_{U|V} \cdot \delta^{8}_{U|V} \|V\|^{8}_{\Box^{1,2,3} S_{1,2,3}} \bigr| \le 2^{7} (\delta_{U|V}\tau)^{t_2 - 1} \times \|V\|^{8}_{\Box^{1,2,3} S_{1,2,3}}$.

This proves (8.28) for $c_2$ sufficiently small, and $t_2 \ge 31$. □



Proof of (8.29) and (8.30). The equation (8.29) is just the definition of conditional expectation.
Note that as $V$ is $(4, \vartheta, 4)$-uniform, we have



E 



x° x^ 6S1 7^-Z ■ ^- BslU/ ^] 

1,2,3' lAS^'^lAS 

(8.34) =6L|X|4 n ^1^ + ^' 

l<;<fc<3 
(8.35) \e\<^,{5u\vTf°Bs[V],

by (8.28), and (5.3).



The inequality (8.30) is clearly a relative of Lemma 5.14, but does not follow from any
principle like that which we have stated. Indeed, we will see that (8.15) is instrumental
to this inequality, as it has been to the prior inequalities. Recalling (5.16), we see that we
need to estimate $\mathbb{E} Z^2 \cdot U$. This is a linear form on $U$ and $V$, which we now specify. Take
$\Omega \subset \{0,1,2\}^{\{1,2,3\}}$ to be the set of maps $\epsilon : \{1,2,3\} \to \{0,1,2\}$ such that the range of $\epsilon$ does not
include both 1 and 2. Then,

(8.36) ^x^,,,eS,,,Z' ■ U = ^,,^^,s,,V{xl2,3)y(^l2,3) JJ ^(^2,3) " 

)=1,2,3 

There are 13 occurrences of $U$ in this expression. (Of the 7 occurrences of $U$ in $B_8[U, V]$,
all but one get 'doubled' in the expression above.) Each occurrence is expanded as
$f_1 + f_0$, where $f_1 = \delta_{U|V} V$. The leading term is when each occurrence of $U$ is replaced by
$f_1$. This leads to

'i=l,2,3 eei^ 1<;<J:<3 

(8.37) ^6]^^^' H bl, = b'l^-Lv. 

l</</c<3 

Recall that this last expectation can be estimated by assumption that $V$ is $(4, \vartheta, 4)$-uniform,
see (5.3).

In each of the $2^{13} - 1$ remaining terms, there is at least one occurrence of $U$ which is
replaced by $f_0$. As in the previous two proofs, we are again in a situation in which (5.19)
applies. Therefore, as (8.15) fails, each of these terms is at most



(8.38) 2Lv[d' +C2{du\vT)'']. 



Therefore, for $c_2$ sufficiently small, and $t_2$ sufficiently large, we can combine (8.38), (8.37)
and (8.36) to conclude that



Exj2 36Si,2,3^^ ■ ^ - ^u\ v^v + e' 
(8.39) \e'\<c'^Lv{6uivTy' 
Here, the implied constant in '=' depends upon the failure of the inequality (8.15), and $L_V$
is defined in (8.37).



Now observe that combining (8.34) and (8.36) and (8.37), we have



(8.40) 
(8.41) 



l</</c<3 

= (eZ • uf + e" 

\e"\ < c',Bs[Vn{6u\vTy' + U^uivTf'f . 



In the last line, we have used (8.35) and (8.39). Dividing (8.40) by $\mathbb{P}(U \mid T_4)^2$, and using the
estimate in (8.41) completes the proof of (8.30). □



We can complete the proof of Lemma 8.3, assuming the inequalities (8.27)-(8.30). For
a suitably generic point $x^0_{1,2,3} \in U$, we define the new data in (8.6) to be

'-'lV'^12 3/ ~ I 1 I 123 ^1' 

with a corresponding definition for S'^ix^^ 23^ ^^"^ '^3('*^i2 3)- ^^^ ^^* '^i2(''^i2 3) ^^ defined as 

i>-|^ 2V^l,2,3/ ~ l-'^l,2 '-'lV-^1,2,3-' '-'2V'^1,2,3/ I ''^1,2,3 ' ' 

with a corresponding definition for S^3(x'^2 3) ^^id '^2 3(-^i2 3)- ^^^^ '^^ ^^^' ^^^ ^^* ^'('*^i2 3) ^^ 
taken to be 



With these definitions, note that (8.7) holds, that is, if $V = T_4$, then $V' = T'(x^0_{1,2,3}) = T'_4$ in the
new T-system. The point of this definition is that



with the last expression found in (|8.28|) . 

Now, set 

W = {x? 2,3 e U I P,;^^s^^3(T'(<2,3)) ^ i^L I V B8[l^]) . 
It follows from (8.29) and (8.30) that we have

Ko^^es,JU-U') < F{U) ■ ({T5uivYBs[V]fvar{Z \ U) 
<F{U){t6u\v)''. 

Now, it will follow from the $(4, \vartheta, 3)$-uniformity of $V$, and Lemma 5.14, that we have



Var,o^jE,;^^,s^^3 II V{xl,^,)\V{xl,^,))<dBs[Vf 



Here, d is as in (8.5). Therefore, it will follow that in the formula (8.27), we can change the leading $U(x^0_{1,2,3})$ to $U'(x^0_{1,2,3})$. Namely, we have

Bs[U -U',U,...,U]< MU -U',V,...,V] 
(8.42) <2{t6uiv)''MV]. 

We can conclude this proof by estimating as follows: For element ^5 2 3 ^ ^'' ^^ have 

Exj^gGSi^a ne£(0,l)iA3 ^(^1,2,3) 

sup ¥{U I T) = ^ 

^'^'3 " e*0,l 



> 



42,3-^12,3^51,2,3 ^'(^1,2,3) nee|0,l|i'^'3 ^(^1,2,3) 



E:,Oxi -651,2,3 ^'(^12,3)^(^1,2,3) nee|0,l|iA3 ^(^1,2,3) 
" " e*0,l 

>6u\v + W. 



The last line follows by combining (8.27), (8.28), and (8.42), with this last inequality showing that modifications of (8.27) and (8.28) hold, with the leading $U(x^0_{1,2,3})$ replaced by $U'(x^0_{1,2,3})$.



9 Proof of Uniformizing Lemma

We marshal several facts, and set some notation, before beginning the main lines of the proof of the Information Lemma 3.17.

9.1 Martingales

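We will use basic facts about martingales. Let $Z$ be a real-valued random variable on a probability space $\Omega$, bounded by one, and let $\mathcal{P}$ be a finite partition of $\Omega$. We refer to the elements of the partition as atoms. The conditional expectation of $Z$ relative to $\mathcal{P}$ is

$\mathbb{E}(Z \mid \mathcal{P}) := \sum_{A \in \mathcal{P}} A \cdot \mathbb{P}(A)^{-1}\, \mathbb{E}(Z \cdot A)$,

where a set $A$ is identified with its indicator function.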



A partition $\mathcal{P}$ refines $\mathcal{Q}$ iff each element of $\mathcal{Q}$ is a finite union of elements of $\mathcal{P}$. In our application, all partitions will be finite collections of sets. Let $\mathcal{P}_n$ be a refining sequence of partitions of $\Omega$, meaning that $\mathcal{P}_{n+1}$ refines $\mathcal{P}_n$ for all integers $n$. We will take $\mathcal{P}_0$ to be the trivial partition, namely $\mathcal{P}_0 = \{\Omega\}$.

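The sequence of random variables $\mathbb{E}(Z \mid \mathcal{P}_n)$ is an example of a martingale. The sequence of random variables $\Delta Z_n = \mathbb{E}(Z \mid \mathcal{P}_n) - \mathbb{E}(Z \mid \mathcal{P}_{n-1})$, for $n \ge 1$, is a martingale difference sequence. Then, the sum below is telescoping:

$\mathbb{E}(Z \mid \mathcal{P}_n) = \mathbb{E}(Z \mid \mathcal{P}_0) + \sum_{m=1}^{n} \Delta Z_m$.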



Observe that the martingale difference sequence is a sequence of pairwise orthogonal random variables. That is, for $m < n$,

(9.1)    $\mathbb{E}\, \Delta Z_m \cdot \Delta Z_n = 0$.

Indeed, as the partitions $\mathcal{P}_n$ are refining, and $m < n$, for each element $E \in \mathcal{P}_m$, the random variable $\Delta Z_m$ is constant on $E$, while $\mathbb{E}\, \Delta Z_n \cdot E = 0$. This leads us to:
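The orthogonality (9.1) can be checked by hand on a small example. The following is a minimal sketch, assuming a uniform probability measure on eight points and a hypothetical chain of four refining partitions; the names `cond_exp`, `inner`, `gram`, and `energy` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def cond_exp(z, partition, prob):
    """Return E(Z | P) as a dict point -> value; it is constant on each atom."""
    out = {}
    for atom in partition:
        mass = sum(prob[w] for w in atom)
        avg = sum(prob[w] * z[w] for w in atom) / mass
        for w in atom:
            out[w] = avg
    return out


def inner(f, g, prob):
    """Return E[f * g]."""
    return sum(prob[w] * f[w] * g[w] for w in prob)


# Assumed toy data: uniform measure on 8 points, Z bounded by one.
omega = range(8)
prob = {w: Fraction(1, 8) for w in omega}
z = {w: Fraction(v, 10) for w, v in zip(omega, [3, -7, 10, 0, -2, 5, 9, -10])}

# A refining sequence of partitions, starting from the trivial one.
partitions = [
    [frozenset(range(8))],
    [frozenset(range(4)), frozenset(range(4, 8))],
    [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5}), frozenset({6, 7})],
    [frozenset({w}) for w in omega],
]

cond = [cond_exp(z, p, prob) for p in partitions]
# Martingale differences Delta Z_n = E(Z | P_n) - E(Z | P_{n-1}).
diffs = [
    {w: cond[n][w] - cond[n - 1][w] for w in omega} for n in range(1, len(cond))
]

# Gram matrix of the differences; off-diagonal entries should vanish by (9.1).
gram = [[inner(f, g, prob) for g in diffs] for f in diffs]
# Energies E[E(Z | P_n)^2], which increase along the refining sequence.
energy = [inner(c, c, prob) for c in cond]
```

Exact rational arithmetic is used so that the off-diagonal entries of `gram` are exactly zero rather than zero up to rounding; the same data also shows that the energy gained between the first and last partition is the sum of the squared norms of the differences.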

9.2 Proposition. Let $0 < u < 1$. Suppose that $Z$ is a random variable bounded by 1, and that $\mathcal{P}_n$ is a sequence of refining partitions such that for an increasing sequence of integers $t_m$ we have

$\mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m - 1})^2] + u \le \mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m})^2]$,    $1 \le m \le M$.

Then, $M \le u^{-1}$.

9.3 Remark. Below, we will refer to an increasing sequence of integers as 'stopping times.' 
An extension of this definition, to make the stopping times certain sequences of measurable 
functions, is an essential tool in martingale theory. 
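Proof. Notice that the assumption tells us that $\mathbb{E}(\Delta Z_{t_m})^2 \ge u$. Indeed, since $\mathbb{E}(Z \mid \mathcal{P}_{t_m}) = \mathbb{E}(Z \mid \mathcal{P}_{t_m - 1}) + \Delta Z_{t_m}$, and by orthogonality of martingale difference sequences,

$\mathbb{E}(\Delta Z_{t_m})^2 = \mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m})^2] - 2\,\mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m}) \cdot \mathbb{E}(Z \mid \mathcal{P}_{t_m - 1})] + \mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m - 1})^2]$
$= \mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m})^2] - \mathbb{E}[\mathbb{E}(Z \mid \mathcal{P}_{t_m - 1})^2]$
$\ge u$.

We then have

$1 \ge \mathbb{E} Z^2 \ge \sum_{m=1}^{M} \mathbb{E}(\Delta Z_{t_m})^2 \ge M u$.

□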



We will use the following extension of the previous proposition.

9.4 Corollary. Suppose that Q' c Q, where (Q, P) is a probability space. Let P be a partition of 
Q' into a finite number of sets. Let P^ be a sequence of refining partitions ofp, and tm{p),for p eP, 
be a set of stopping times so that for alll <m < M{p) we have 

E[E(p I Pupyif] + u< E[E(Z | P,„,(p))2] , p G P, 1 < m < M{p) . 

Then, 

Y,M{p)<u-K 

peP 

Proof We have 

M(p) M{p) 

peP pinP m-1 pinP m=l 

And this proves our Corollary. D 

Here is an extension of the previous propositions, where the conditional variance 
increment is permitted to be much smaller. 
9.5 Proposition. Let $0 < u, \tau < 1$, and $C \ge 1$. Suppose that $0 \le Z \le 1$ is a random variable, that $\mathcal{P}_n$ is a sequence of refining partitions, and that $t_m$ is a sequence of stopping times such that for all $1 \le m \le M$,

]E[Z-E„] >T 
£„, := [p G P,,„_i I E[E(Z • p I Ptjf > E(Z | pf + uE{Z \ pf] 

Then, M < u~^t~^. 

Proof. Observe that for A^ := E(Z | Pt,„) - E(Z | Pt,„-i) we have the estimate 

E[A^,-EJ>M2E[E(Z|P,„,_ifEJ. 
Therefore, using Jensen's inequality, available to us as C > 1, 

MM M 

l>Y^E^l>Y^ BAlE^ > Y^ u^E[E{Z I Pt,„-ifEJ 

m=l m=l ra=l 

M 

> Yj u^E[E{Z I Pt„-i)E„^f > Muh^ . 
This proves the Proposition. □



9.2 Partitions 

We need several partitions, which 'fit together' in an appropriate way. 

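Let $\Omega$ be a set with partition $\mathcal{P}$. Let $\Omega' \subset \Omega$ have partition $\mathcal{P}'$. Say that $\mathcal{P}'$ is subordinate to $\mathcal{P}$ iff each atom $p' \in \mathcal{P}'$ is contained in some atom $p \in \mathcal{P}$. We do not insist that every atom of $\mathcal{P}$ be a union of atoms from $\mathcal{P}'$, that is, we do not require that $\mathcal{P}'$ refine $\mathcal{P}$.

The minimum of two partitions $\mathcal{P}$ and $\mathcal{P}'$ of the same set $\Omega$ is

$\mathcal{P} \wedge \mathcal{P}' = \{A \cap B \mid A \in \mathcal{P},\ B \in \mathcal{P}'\}$.

If $\mathcal{P}'$ is a partition of a subset $\Omega' \subset \Omega$, we use the same notation $\mathcal{P} \wedge \mathcal{P}'$ for a (maximal) partition of $\Omega'$ subordinate to both $\mathcal{P}$ and $\mathcal{P}'$.

Suppose that $\mathcal{P}$ is a partition of $\Omega$, and that $\mathcal{P}'$ is a partition of $\Omega' \subset \Omega$ that is subordinate to $\mathcal{P}$. We define

(9.6)    $\operatorname{multi}(\mathcal{P}' \mid \mathcal{P}) = \sup_{p \in \mathcal{P}} \sharp\{p' \in \mathcal{P}' \mid p' \subset p\}$.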
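The three notions of this subsection (subordination, the minimum of two partitions, and the multiplicity (9.6)) are easy to experiment with on finite sets. The following is a minimal sketch, assuming partitions are represented as lists of `frozenset` objects; the function names `minimum`, `is_subordinate`, and `multi` are illustrative choices, not notation from the paper.

```python
def minimum(p, q):
    """Return the minimum of two partitions: all nonempty intersections A & B."""
    return [a & b for a in p for b in q if a & b]


def is_subordinate(p_prime, p):
    """Check that every atom of p_prime lies inside some atom of p."""
    return all(any(a <= b for b in p) for a in p_prime)


def multi(p_prime, p):
    """Return the largest number of atoms of p_prime inside one atom of p, as in (9.6)."""
    return max(sum(1 for a in p_prime if a <= b) for b in p)


# Assumed toy data: two partitions of {0, ..., 7}.
P = [frozenset({0, 1, 2, 3}), frozenset({4, 5, 6, 7})]
Q = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5, 6, 7})]
M = minimum(P, Q)

# A partition of the subset {0, 1, 5}: subordinate to P, but not a refinement of P.
sub = [frozenset({0}), frozenset({1}), frozenset({5})]
```

Here `M` consists of the four atoms {0,1}, {2,3}, {4}, and {5,6,7}, so each atom of `P` contains two atoms of `M`. The partition `sub` illustrates the remark above that subordination does not require the atoms of `P` to be unions of atoms of the finer partition.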



9.3 Useful Propositions 

This general proposition provides the motivation for the overall approach we take. 

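9.7 Proposition. Let $0 < v < \delta < 1$. Let $A \subset T \subset X$ be finite sets with $\mathbb{P}(A \mid T) \ge \delta + v$. Let $\mathcal{P}$ be a partition of $X$, and let $\mathcal{P}' \subset \mathcal{P}$ be any subset of $\mathcal{P}$ for which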

P([Jp)< W4. 

peP' 

Then, there is some element p eP -P' with 

P(T I p) > 2P(T I X) , P(A I T n p) > 6 + f . 

Proof Take P" to be all those elements p eP which are in P' or P(T | P) < |P(T | X). It is 
clear that we have 

p(a n y p I r) < f . 

peP" 

Applying the pigeonhole principle to those elements of $\mathcal{P} - \mathcal{P}''$ proves the Proposition. □

The 'energy increment' steps we take are governed by these two general propositions. 

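9.8 Proposition. Let $A$ be a subset of a probability space $(\Omega, \mathbb{P})$. Suppose that there is a subset $B \subset \Omega$ for which we have

$\mathbb{P}(A \mid B) = \mathbb{P}(A) + v > \mathbb{P}(A)$.

Then, for the partition $\mathcal{P}_B$ of $\Omega$ generated by $B$, we have

(9.9)    $\mathbb{E}[\mathbb{E}(A \mid \mathcal{P}_B)^2] \ge \mathbb{P}(A)^2 + \mathbb{P}(B) \cdot v^2$.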
In application, we will have $v, \mathbb{P}(B) \ge \mathbb{P}(A)^C$, for an absolute constant $C$. Thus, we have

$\mathbb{E}[\mathbb{E}(A \mid \mathcal{P}_B)^2] \ge \mathbb{P}(A)^2 + \mathbb{P}(A)^{3C}$.

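Proof. Let us set $\alpha = \mathbb{P}(A)$, $\mathbb{P}(B) = \beta$, so that

$\mathbb{P}(A \cap B) = (\alpha + v)\beta$,    $\mathbb{P}(A \cap B^c) = (1 - \beta)\alpha - v\beta$.

We can calculate the left-hand side of (9.9) directly.

$\mathbb{E}[\mathbb{E}(A \mid \mathcal{P}_B)^2] = \mathbb{P}(B)\,\mathbb{P}(A \mid B)^2 + (1 - \mathbb{P}(B))\,\mathbb{P}(A \mid B^c)^2$
$= \mathbb{P}(A \cap B) \cdot \mathbb{P}(A \mid B) + \mathbb{P}(A \cap B^c)\,\mathbb{P}(A \mid B^c)$
$= (\alpha + v)^2 \beta + (1 - \beta)^{-1}\bigl((1 - \beta)\alpha - v\beta\bigr)^2$
$\ge \alpha^2 + v^2 \beta$.

And this proves the proposition. □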
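The inequality (9.9) can be verified exactly on a small instance. The following is a minimal sketch, assuming the uniform measure on twenty points and hypothetical sets `A` and `B`; expanding the third line of the computation above shows that the slack in (9.9) is exactly $v^2\beta^2/(1-\beta)$, and the example checks this as well.

```python
from fractions import Fraction


def energy_of_B(n, A, B):
    """Return E[E(A | P_B)^2] for the uniform measure on range(n).

    P_B is the two-atom partition {B, complement of B}.
    """
    complement = set(range(n)) - B
    total = Fraction(0)
    for atom in (B, complement):
        if atom:
            density = Fraction(len(A & atom), len(atom))
            total += Fraction(len(atom), n) * density**2
    return total


# Assumed toy data, chosen so that A is denser on B than on the whole space.
n = 20
A = set(range(0, 10))
B = set(range(5, 11))

alpha = Fraction(len(A), n)                 # P(A)
beta = Fraction(len(B), n)                  # P(B)
v = Fraction(len(A & B), len(B)) - alpha    # density increment of A on B

lhs = energy_of_B(n, A, B)
rhs = alpha**2 + beta * v**2
slack = v**2 * beta**2 / (1 - beta)
```

With these sets, `alpha` is 1/2, `beta` is 3/10, and `v` is 1/3, so the left side of (9.9) is 25/84 against a right side of 17/60.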

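This trivial extension of the previous proposition is the one that we use.

9.10 Proposition. Let $A$ be a subset of a probability space $(\Omega, \mathbb{P})$, and let $\mathcal{P}$ be a finite partition of $\Omega$. For a subset $\mathcal{Q} \subset \mathcal{P}$, suppose the following holds. For each element $p \in \mathcal{Q}$, there is a further subset $p' \subset p$ so that

$\mathbb{P}(A \mid p') \ge \mathbb{P}(A \mid p) + v$,    $p \in \mathcal{Q}$,

$\mathbb{P}\bigl(\bigcup_{p \in \mathcal{Q}} p'\bigr) \ge \tau$.

Then, for the partition $\mathcal{P}'$ which refines both $\mathcal{P}$ and $\{p' \mid p \in \mathcal{Q}\}$, we have the estimate

$\mathbb{E}[\mathbb{E}(A \mid \mathcal{P}')^2] \ge \mathbb{E}[\mathbb{E}(A \mid \mathcal{P})^2] + \tau v^2$.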

We will appeal to a simple bound for the tower notation given by

(9.11)    $2 \uparrow n := 2^n$,    $2 \uparrow\uparrow n := 2 \uparrow (2 \uparrow\uparrow (n - 1))$.

The function $2 \uparrow\uparrow n$ is called the Ackerman function, and its inverse is

(9.12)    $\log_* N = \min\{n \mid N \le 2 \uparrow\uparrow n\}$.
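To get a feeling for how quickly the tower in (9.11) grows, and how slowly its inverse (9.12) grows, the definitions can be evaluated directly. The following is a minimal sketch; the recursion (9.11) does not state a base case, so the sketch assumes $2 \uparrow\uparrow 1 = 2$, and the names `tower` and `log_star` are illustrative.

```python
def tower(n):
    """Return 2 ↑↑ n, assuming the base case 2 ↑↑ 1 = 2 (not stated in the text)."""
    if n < 1:
        raise ValueError("tower is only defined here for n >= 1")
    value = 2
    for _ in range(n - 1):
        value = 2**value  # 2 ↑↑ (k + 1) = 2 ↑ (2 ↑↑ k)
    return value


def log_star(N):
    """Return the least n >= 1 with N <= 2 ↑↑ n, as in (9.12)."""
    n = 1
    while tower(n) < N:
        n += 1
    return n
```

Under this convention the first values of the tower are 2, 4, 16, and 65536, and `log_star` is at most 5 for every integer with fewer than about twenty thousand decimal digits.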
9.13 Proposition. For integers $\ell, u, v \ge 2$ define

$\psi(0, u, v) = u \cdot v$,    $\psi(\ell + 1, u, v) = 2 \uparrow (u \cdot \psi(\ell, u, v))$.

We have the estimate

$\psi(\ell, u, v) \le 2 \uparrow\uparrow [\ell + \log_* 2uv]$.

Proof. Define 

_ log^M _ log2 u{l+ Ck) 

^^ ~ uxp{£ - 1) ' ^^'^ ~ uxpik - 1) ■ 

It is elementary to see that ei < 1. 

The point of these definitions is that we have 

ip{i, w, z;) = 2 t [(1 + ee)uxp{£ - 1)] 

= 2T[2T[(l + e^-iW^-2)] 



C times 



=2T[2T[---2T[(l+ei)M---]] 
<2tT[^ + log,2uz;]. 



□



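The following definition is used to make a quicker appeal to Lemma 8.2 and its relative Lemma 8.3.

9.14 Definition. Consider a subset $S$ of a set $X$, a partition $\mathcal{P}$, and a positive parameter $\Lambda$. Say that $\mathcal{P}'$ is $(S, \Lambda, \mathcal{P})$-good iff $\mathcal{P}'$ refines $\mathcal{P}$ and

(9.15)    $\mathbb{E}(\mathbb{E}(S \mid \mathcal{P}')^2) \ge \mathbb{E}(\mathbb{E}(S \mid \mathcal{P})^2) + \Lambda$.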



9.4 The U(3) Norm

In this section we discuss the Lemmas needed to obtain sets that are uniform with respect 
to the Gowers U{3) norm. 
9.16 Definition. We call a partition of $H \times H \times H$ affine iff all atoms of the partition are of the form $V_1 \times V_2 \times V_3$, where the $V_i$ are all translates of the same subspace $V \le H$. This is an essential definition for us, as an affine partition in, say, the basis $(e_1, e_2, e_3)$ is also affine in any choice of basis formed from these three vectors. Each atom of an affine partition is, after translation, a copy of $H \times H \times H$ with a lower dimension.

In particular, given $S_j$, $1 \le j \le 4$, and an affine partition $\mathcal{P}$, for each atom $\alpha \in \mathcal{P}$, it makes sense to compute the Gowers uniformity norm of $S_j$ relative to the atom $\alpha$. That is, the atom $\alpha$ determines an affine subspace $V_j$ in the coordinate $e_j$. After translation, we could assume that $V_j$ is actually a subspace, in which we can unambiguously compute the Gowers U(3) norm. This is what we mean by

$\|S_j - \mathbb{P}(S_j \mid \alpha)\|_{U(3),\alpha}$.

The codimension of an affine partition, written as $\operatorname{codim}(\mathcal{P})$, is the maximum codimension of $V_i$ in $H$, for all $V_1 \times V_2 \times V_3 \in \mathcal{P}$. Clearly, we have

IPI ^ ^codim(P) 

We need the following version of the Inverse Theorem for the U(3) Norm.

9.17 Inverse Theorem for the Gowers U(3) Norm. There are constants $0 < c < C < \infty$ so that the following holds. Let $S \subset H$ and assume that $\dim(H) \ge 10Cu^{-C}$ and

$\|S - \mathbb{P}(S \mid H)\|_{U(3)} \ge u$.

Then, there is an affine subspace $H'$ of $H$ so that $\dim(H') \ge \dim(H) - Cu^{-C}$ and

$\mathbb{P}(S \mid H') \ge \mathbb{P}(S \mid H) + cu^{C}$.

We emphasize that the exact values of the estimates on the codimensions above are important in the study of four-term progressions, but the exact form of these estimates is not important to the proof of our Main Theorem, Theorem 1.2. For this result, see [10, pp. 27-28].

We will use this elementary observation: If $\mathcal{P}$, $\mathcal{P}'$ are affine partitions, then

$\operatorname{codim}(\mathcal{P} \wedge \mathcal{P}') \le \operatorname{codim}(\mathcal{P}) + \operatorname{codim}(\mathcal{P}')$.
9.18 Proposition. There is a constant $C$ so that the following holds for all $0 < u, \tau < 1$. Let $S_j$, $1 \le j \le 4$, be sets in the $j$th coordinate. Then there is an affine partition $\mathcal{P}$ of $H \times H \times H$, satisfying $\operatorname{codim}(\mathcal{P}) \le (C/u\tau)^{C}$, so that

$\mathbb{P}\bigl(A \in \mathcal{P} \mid \sup_j \|S_j\|_{U(3),A} \ge u\bigr) \le \tau$.

Proof. Here is an important point in the proof. For an affine partition P, suppose there is 
an atom A G P such that 

||S;-P(Sy|A)||u(3),A>U 

Let Aj denote the affine subspace for coordinate ey. Then, there is a partition P^ of Ay into 
affine subspaces of codimension < Cu~^, for which we have 

EA/E(Sy n Aj I Pa?) > ^A^iSj n Ajf + cu"" . 

A moment's thought shows that there is then an affine refinement $\mathcal{P}'$ of $\mathcal{P}$, in which only the atom $A$ is further refined, for which we have

]E(E(Sy I P')^) > E(E(S; I Pf) + CM^P(A). 

Indeed, since the atom A is the product of translates of the same subspace Aj, we impose an 
appropriate translate of the partition Pa on the two choices of the remaining coordinates. 
The codimension of the refining partition has increased by only Cm"'-. 

Here is the principal line of the argument. We construct a sequence of refining affine partitions $\mathcal{P}_n$, and a sequence of stopping times $\tau_{j,k}$, for $1 \le j \le 4$ and $k \ge 1$, which are used to track the running time of the recursive procedure below.

Let P be an affine partition. Notice that there is some C > so that the following is 
sufficient condition for the existence of a {Sj, ifx, P)-good partition P': 

P(A G P I ||S;||u(3),A >u)> t/4 

In addition, P' can be taken to be affine and codim(P') < codim(P) + Cu"*-. This is a 
consequence of the discussion at the beginning of the proof. The notion of a good partition 
is defined in Definition 9.14.

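Initialize variables

$\mathcal{P}_0 \leftarrow \{H \times H \times H\}$,  $n \leftarrow 0$,  $\tau_{j,0} \leftarrow 0$,  $k_j \leftarrow 0$.

WHILE for some $1 \le j \le 4$, there is an affine $(S_j, u^{C}\tau/4, \mathcal{P}_n)$-good partition $\mathcal{P}'$, with $\operatorname{codim}(\mathcal{P}') \le \operatorname{codim}(\mathcal{P}_n) + Cu^{-C}$, increment

$n \leftarrow n + 1$,  $k_j \leftarrow k_j + 1$.

Define $\tau_{j,k_j} = n$, and $\mathcal{P}_{n+1} = \mathcal{P}'$.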



As the underlying space is finite dimensional, this WHILE loop must stop. The sequence of stopping times $\tau_{j,1}, \ldots, \tau_{j,k}$ cannot exceed $(\tau u)^{-C}$. Indeed, the hypotheses of Proposition 9.2 hold, proving this claim immediately. The conclusions of the Lemma are then immediate from the recursion, and the observation (9.4).

□
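The WHILE loop in the proof above is an instance of a generic energy-increment scheme: keep refining the current partition as long as some admissible refinement raises the energy $\mathbb{E}(\mathbb{E}(S \mid \mathcal{P})^2)$ by a fixed amount, and note that this can happen at most (increment)$^{-1}$ times because the energy is at most 1. The following is a toy sketch of that scheme only. It assumes a uniform measure on a finite set and a hypothetical fixed list of candidate splitting sets in place of the affine subspaces supplied by the Inverse Theorem; none of the names come from the paper.

```python
from fractions import Fraction


def energy(S, partition, n):
    """Return E[E(S | P)^2] for the uniform measure on range(n)."""
    return sum(
        Fraction(len(a), n) * Fraction(len(S & a), len(a)) ** 2 for a in partition
    )


def refine(partition, B):
    """Split every atom by B, dropping empty pieces."""
    out = []
    for a in partition:
        for piece in (a & B, a - B):
            if piece:
                out.append(piece)
    return out


def energy_increment(S, candidates, n, lam):
    """Refine while some candidate split is (S, lam, P)-good; return (P, steps)."""
    partition = [frozenset(range(n))]
    steps = 0
    while True:
        current = energy(S, partition, n)
        for B in candidates:
            finer = refine(partition, B)
            if energy(S, finer, n) >= current + lam:
                partition, steps = finer, steps + 1
                break
        else:
            # No candidate gives an increment of size lam: the loop stops.
            return partition, steps


# Assumed toy data.
n = 16
S = frozenset({0, 1, 2, 3, 8, 9})
candidates = [frozenset(range(8)), frozenset(range(4)), frozenset({8, 9, 12, 13})]
lam = Fraction(1, 32)
final_partition, steps = energy_increment(S, candidates, n, lam)
```

The number of accepted refinements is bounded by `1 / lam` regardless of how the candidates are chosen, which is the content of Proposition 9.2 in this setting.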

In fact, we will rely upon the following variant of the previous result.

9.19 Lemma. There is a constant $C$ so that the following holds for all $0 < u, \tau < 1$. Let $\mathcal{S}_j$, $1 \le j \le 4$, be a collection of sets in the $j$th coordinate. Then there is an affine partition $\mathcal{P}$ of $H \times H \times H$ of

4 c 

codim(P) < [(mt)-^ 11 '"^^'l ^^^ ^^"^ ^ ^ ' ^^PH^iH^OM > u) < t . 

This proof is a simple variant of the previous proof. Note that the codimension of the partition admits a substantially worse bound. This is because we have to keep track of a running time for each possible set $S \in \bigcup_j \mathcal{S}_j$.



9.5 The Box Norm in Two Variables 

The goal of this subsection is Lemma 9.32, which combines the fact about the U(3) norm in Lemma 9.19 with some facts about the Box Norm. We begin with some generalities on the Box Norm in two variables. Recall the definition of $\mathcal{P}'$ being $(S, \delta, \mathcal{P})$-good given in (9.15) above.

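9.20 Proposition. There is a $C_2$ so that for all $0 < u, \tau < 1$ the following holds. Let $Z \subset X \times Y$, and let $\mathcal{P}_X$, $\mathcal{P}_Y$ be partitions of $X$ and $Y$. Suppose that the following condition holds.

$\mathbb{P}(E \mid X \times Y) \ge \tau$, where

$E = \bigcup\{(p_x, p_y) \in \mathcal{P}_X \times \mathcal{P}_Y \mid \|Z - \mathbb{P}(Z \mid p_x \times p_y)\|_{\Box(p_x \times p_y)} \ge u\}$.

Then, there are partitions $\mathcal{P}'_X$ and $\mathcal{P}'_Y$ so that

(9.21)    $\mathcal{P}'_X \times \mathcal{P}'_Y$ is $(Z, \tau u^{C_2}, \mathcal{P}_X \times \mathcal{P}_Y)$-good.

(9.22)    $\operatorname{multi}(\mathcal{P}'_X \mid \mathcal{P}_X) \le 2 \uparrow \sharp\mathcal{P}_Y$, and likewise for $\mathcal{P}'_Y$.

Here, $C_2$ could be taken to be 4.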

Note that the estimate (|9.22|) , recursively applied, leads to tower power style bounds. 

Proof. For each $(p_x, p_y) \in E$, Lemma 8.2 assures us of the existence of a partition $\mathcal{P}_x(y)$ of $p_x$ into two elements, and a partition $\mathcal{P}_y(x)$ of $p_y$ into two elements, so that $\mathcal{P}_x(y) \times \mathcal{P}_y(x)$ is $(Z \cap p_x \times p_y, u^{C_2}, p_x \times p_y)$-good. (There is no $\tau$ in this last assertion.)

We take

$\mathcal{P}'_X = \mathcal{P}_X \wedge \bigwedge_{y \in \mathcal{P}_Y} \mathcal{P}_x(y)$,

and likewise for $\mathcal{P}'_Y$. It is clear that (9.22) holds. By the assumption that $\mathbb{P}(E) \ge \tau$, and the martingale property (9.1), it follows that (9.21) holds. □

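9.23 Proposition. There is a $C_2 > 0$ so that for all $0 < u, \tau < 1$ the following holds. Let $Z \subset X \times Y$, and let $\mathcal{P}_X$, $\mathcal{P}_Y$ be partitions of $X$ and $Y$. Let $\mathcal{P}_Z$ be a partition of $Z$ that is subordinate to $\mathcal{P}_X \times \mathcal{P}_Y$. Suppose that the following condition holds.

$\mathbb{P}(E \mid Z) \ge \tau$,

$E = \bigcup\{z \in \mathcal{P}_Z \mid \|z - \mathbb{P}(z \mid X_z \times Y_z)\|_{\Box(X_z \times Y_z)} \ge u\}$.

Here, $z \subset X_z \times Y_z$, with $X_z \in \mathcal{P}_X$ and $Y_z \in \mathcal{P}_Y$. The sets $X_z$, $Y_z$ must exist as $\mathcal{P}_Z$ is subordinate to $\mathcal{P}_X \times \mathcal{P}_Y$. Then, there are partitions $\mathcal{P}'_X$ and $\mathcal{P}'_Y$ so that

(9.24)    $\mathcal{P}'_X \times \mathcal{P}'_Y$ is $(\mathcal{P}_Z, \tau u^{C_2}, \mathcal{P}_X \times \mathcal{P}_Y)$-good.

(9.25)    $\operatorname{multi}(\mathcal{P}'_X \mid \mathcal{P}_X) \le 2 \uparrow [(\sharp\mathcal{P}_Y) \cdot \operatorname{multi}(\mathcal{P}_Z \mid \mathcal{P}_X \times \mathcal{P}_Y)]$, and likewise for $\mathcal{P}'_Y$.

Here, $C_2$ could be taken to be 4.

Note in particular the form of the tower in (9.25), with the notation as in (9.11).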
Proof. For each $z \in E$, there is a partition $\mathcal{P}^z_X$ of $X_z$ into two elements, and likewise a partition $\mathcal{P}^z_Y$ of $Y_z$, so
that P;^ X P;, is (z, 1/^2, {X^} x {y,})-good. This follows from ^92^ and (19:22b . 



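Define the partition $\mathcal{P}'_X$ to be

$\mathcal{P}'_X = \mathcal{P}_X \wedge \bigwedge_{z \in E} \mathcal{P}^z_X$.

Observe that (9.25) follows. Indeed, for each $x \in \mathcal{P}_X$, we could have up to $(\sharp\mathcal{P}_Y) \cdot \operatorname{multi}(\mathcal{P}_Z \mid \mathcal{P}_X \times \mathcal{P}_Y)$ many sets to form the minimum partition over, leading to (9.25).

Use the basic fact about martingales, (9.1), and the assumption that $\mathbb{P}(E) \ge \tau$ to conclude that (9.24) holds. □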



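We make a definition that we use in this section, and the next.

9.26 Definition. We say that the data

(9.27)    $\mathcal{S} = \{H \times H \times H,\ \mathcal{P}_H,\ S_i,\ \mathcal{P}_i,\ R_{j,k},\ \mathcal{P}_{j,k},\ T,\ \mathcal{P}_T \mid 1 \le i \le 4,\ 1 \le j < k \le 4\}$

is a partition-system iff

• $\mathcal{P}_H$ is an affine partition of $H \times H \times H$.

• $S_i \subset H$, and $\mathcal{P}_i$ is a partition of $S_i$ that is subordinate to $\mathcal{P}_H$, $1 \le i \le 4$.

• $R_{j,k} \subset S_j \times S_k$, and $\mathcal{P}_{j,k}$ is a partition of $R_{j,k}$ that is subordinate to $\mathcal{P}_j \times S_k$ and $S_j \times \mathcal{P}_k$, $1 \le j < k \le 4$.

• $T \subset H \times H \times H$ is such that $T \subset R_{j,k}$, $1 \le j < k \le 4$.

We stress that all partitions are collections of subsets of $H \times H \times H$. Set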



Pt/:=PM a Pj,k, 1<^<4, 



l<;</c<4 



(9.28) Pi(^) = ^multi(PHPH), 
(9.29) P2(<S)= Yj multi(P,|P;;fc), 

l<;'<fc<4 

(9.30) PT(.S) = multi(PT|PH), 

These last quantities are some counting functions that we will need to keep track of. 

A trivial partition-system is a partition-system in which each of the partitions are trivial. 
For each t G Pj, we take 

(9.31) ^3(0 = {Ht,i X Ht,2 X Ht,3 , sui , rt:j,k , t \1 <i <4:, 1 < j <k <4} 
to be the trivial partition-system associated to t. Namely, we have 

• t c Hf 1 X Hf 2 X Hf 3. Here, Hf 1 x Hf 2 x Hf 3 may be the product of affine subspaces in 
H X Hx H, but all relevant notions extend to this setting. 

• St:j,k e Pj^k, with St:j,k C Hf4 X Ht,2 X Ht,3, and t = Al<;<lc<4 St:j,k- 

This is the Lemma that will be applied in the next section. 

9.32 Lemma. Let Ci > 1 be given. There are finite functions ^2-n : [0,1]^ x N^ — > N and 
^codim : [0, 1]^ X N^ — > N so that the following holds for all < U2, W3T < 1. 



For all partition-systems S, as in (|9.27|| , there is a partition-system 



(9.33) S' = {HxHxH, P'^, Si, P',,Rj,k, P]^, T, P^ | 1 < / < 4, 1 < ; < fc < 4} 
which refines S, so that these conditions are met. For 1 < z < 4 and 1 < ;', fc < 4, 

(9.34) C0dim(P;,) < ^codim(l^3, T, Pi(<S), P2(<S)) , 

(9.35) multi(P; I Pd < W2-uiu2, t. Pi (5), P2(.S)) , 

(9.36) niulti(P;.^ I Pj A P^) < multi(Py;c I P; X P^ , 

(9.37) P(£2,;,)c|SyXS,)<T, 



E2,j,k = \ rj,k e P'jj. I rp, csjDSk, s^ eP'^,v = i,k, 

Wrj^k - F{rj,k I Sj X Sk)\\uWsjxs, > W2[Pt(<S')]"^^ 
(9.38) F{E,,j\Sj)<T, 

E,,j = L G P;. I ||S; - FiSj I Aj)Aj\\ui3),Ai > UsiFriS')]-^^ 

Finally, Pt(<S') = Pt(<S). We are using the notation (l9^28ll — (l930ll . 



The conclusion is that virtually all of the elements of the partitions $\mathcal{P}'_j$ and $\mathcal{P}'_{j,k}$ are uniform with respect to the Gowers Norm and the Box Norm.

We emphasize that this Lemma provides us with a tower power bound. In (9.35), we have the estimates below; note that we have a $\log_*$, as in (9.12), on the left.



(9.39) log.(tJP;) < 2uf^T-'P2{Sf'-^^ + log. Pi (5) . 

Note that by (9.36), the multiplicities of the partitions $\mathcal{P}'_{j,k}$, defined in (9.6), are not increased in this procedure, though we get a very substantial increase in the multiplicity of the $\mathcal{P}'_j$, from the bound (9.35), forming the principal loss in the application of this Lemma.



The sets s, ^ Eij are 'very uniform,' even with respect to their probabilities in the respective 
cell of P'. The 'tower' notation in (|9.35|) is defined in (|9.11|) . 

Proof. We define a sequence of partition-systems. They are

(9.40)    $\mathcal{S}(m) = \{H \times H \times H,\ \mathcal{P}_H(m),\ S_i,\ \mathcal{P}_i(m),\ R_{j,k},\ \mathcal{P}_{j,k}(m),\ T,\ \mathcal{P}_T(m) \mid 1 \le i \le 4,\ 1 \le j < k \le 4\}$

where $\mathcal{S}(0)$ is the partition-system given to us by assumption. These partition-systems are refining, in the sense that the corresponding sequences of partitions are refining.

In this process, the only incremental changes made to the partitions $\mathcal{P}_T(m)$ are to make them subordinate to the other partitions. Thus, quantities that appear in (9.37) and (9.38) are constant. Namely, $Q = P_T(\mathcal{S}(m))$ is independent of $m$.

We also define a sequence of stopping times $\sigma(j, k; m)$, and counters $m(j, k)$, for $1 \le j < k \le 4$ and $m \ge 0$. Initialize these stopping times as follows, where $1 \le j < k \le 4$.

$m \leftarrow 0$,  $\sigma(j, k; 0) \leftarrow 0$,  $m(j, k) \leftarrow 0$.
We choose $C_2$ as in Proposition 9.23. The main recursion is this: Set

(9.41) A = u^'t = u'^'tQ-^'-^' 

WHILE there are $1 \le j < k \le 4$ so that there are two partitions $\mathcal{P}'_j$ and $\mathcal{P}'_k$ which satisfy (9.24) and (9.25) above for the quantity $\Lambda$. Namely,



P;. A P; is {Pj,k{m),A, Pj{m) A P^(m))-good. 



• The multiplicity of P' satisfies 



mult(P' I P/(m)) < 2 t [mult(P^(m) | Pnim)) ■ multi(P,fc(m) | P,(m) x Pfc(m))] 

(9.42) ' J' J 

< 2 T [mult(P,(m) I Pnim)) ■ multi(Py,,(0) | Py(0) X P,(0))] , 

and likewise for P' . 

k 

We take these steps. Update

1. (Keep track of stopping times.)

$m \leftarrow m + 1$,  $m(j, k) \leftarrow m(j, k) + 1$,  $\sigma(j, k; m(j, k)) \leftarrow m$.

2. (Select affine partition.) To each element of the affine partition $\mathcal{P}_H(m)$, apply Lemma 9.19 to $\mathcal{P}'_j$, $1 \le j \le 4$, with the parameter $\tau$ that is given to us, and the value of $u$ in Lemma 9.19 equal to $u = u_3 Q^{-C_1}$. Set the partition that Lemma 9.19 supplies to us to be $\mathcal{P}_H(m + 1)$. Observe that

(9.43)    $\operatorname{codim}(\mathcal{P}_H(m + 1)) \le \operatorname{codim}(\mathcal{P}_H(m)) + [(u_3 \tau)^{-1} Q]^{D}$.

This follows from Lemma 9.19 and (9.22), for an appropriate choice of constant $D$. Note that the term $\operatorname{multi}(\mathcal{P}'_j \mid \mathcal{P}_H(m))$ is bounded in (9.42).

3. (Updating the remaining partitions.) Set $\mathcal{P}_j(m + 1)$ to be the maximal partition which refines $\mathcal{P}'_j$ and is subordinate to $\mathcal{P}_H(m + 1)$. Set $\mathcal{P}_{j,k}(m + 1)$ to be the maximal partition which refines $\mathcal{P}_{j,k}(m)$, and is subordinate to both $\mathcal{P}_j(m + 1)$ and $\mathcal{P}_k(m + 1)$. The last partition $\mathcal{P}_T(m + 1)$ is then defined.

At the conclusion of the WHILE loop, return this data: For $1 \le j < k \le 4$,

• $m$, the integers $m(j, k)$.

• The sequence of stopping times $\sigma(j, k; \lambda)$, for $0 \le \lambda \le m(j, k)$.

It remains to argue that the partitions returned satisfy the conclusions of the Lemma. We must have (9.37), else by the definition of $\Lambda$ in (9.41) and Proposition 9.23 the routine would not have stopped. The conclusion (9.36) follows from the construction. The conclusion (9.38) follows from the manner in which we apply Lemma 9.19, in particular the point (2) above. The remaining conclusions (9.34) and (9.35) require us to know how many recursions were performed. We turn to this next.

We claim that 

m < A"^ = u~^^T-^Q^'-^^ . 

But this follows from Corollary 9.4 applied to the construction, the sets in $\mathcal{P}_{j,k}$, and the
stopping times o{{j, k}, Yj^k, A). 

Therefore, by induction and (9.42), we have

$\operatorname{multi}(\mathcal{P}'_j \mid \mathcal{P}'_H) = \operatorname{multi}(\mathcal{P}_j(m) \mid \mathcal{P}_H(m))$
$\le 2 \uparrow [P_2 \cdot \operatorname{multi}(\mathcal{P}_j(m - 1) \mid \mathcal{P}_H(m - 1))]$
$\le 2 \uparrow [P_2 \cdot 2 \uparrow [P_2 \cdots [P_2 \cdot 2 \uparrow P_2 \cdot P_1] \cdots ]]$    ($m$ times)
$= \psi(m, P_1, P_2)$.

Here, the notation is from (9.28), (9.29), and Proposition 9.13, which provides the crude bound given in (9.39). This proves (9.35). The final conclusion (9.34) follows from this last bound and (9.43).

□



9.6 The Box Norm in Three Variables 

The goal of this section is to add the considerations about the Box Norm in three variables into our Lemmas, to build up an analog of Lemma 9.32 which also stipulates facts about the partition $\mathcal{P}_T$, about which we have not yet made any statements.
9.44 Lemma. There are finite functions Wcodim / ^r : [0, 1]^ x N^ — > N so that the following 
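holds for all $0 < u_T, \tau_T < 1$.

For all trivial partition-systems $\mathcal{S}$ there is a partition-system $\mathcal{S}'$ as in (9.33), such that

(9.45)    $\operatorname{codim}(\mathcal{S}') \le \Psi_{\mathrm{codim}}(u_T, \tau_T, \mathbb{P}(T \mid H \times H \times H))$,

(9.46)    $P_T(\mathcal{S}') \le \Psi_T(u_T, \tau_T, \mathbb{P}(T \mid H \times H \times H))$,

(9.47)    $\mathbb{P}(E \mid H \times H \times H) \le \tau_T$,

$E := \bigcup\{t \in \mathcal{P}'_T \mid \mathcal{S}_3(t) \text{ is not } u_T\text{-admissible}\}$.

Here, $\mathcal{S}_3(t)$ is the trivial partition-system associated with $t$, as defined in (9.31).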



In (9.47), admissibility is as in Definition 3.4. This proof will generate a second tower power in our estimate for the codimension in (9.46), but we don't detail this particular fact.

Proof. For this proof, we define a sequence of partition-systems $\mathcal{S}(m)$ as in (9.40). These partition-systems are refining in the sense that the corresponding sequences of partitions are refining. We take $\mathcal{S}(0)$ to be the trivial partition-system given by the hypothesis of the Lemma.

We also define a sequence of stopping times $\sigma(\ell, p_\ell)$ for $1 \le \ell \le 4$, with counters $p_\ell \ge 0$. Initialize these variables $\sigma(\ell, 0) \leftarrow 0$ and $p_\ell \leftarrow 0$, where $1 \le \ell \le 4$.

Here is the recursive algorithm. IF m is even, apply of Lemma [9.321 to S{m), with the 
values k{^UtTt)'~ and -^tt specified at the beginning of Lemma [9.44[ the Lemma we are 
proving. The value of C\ in Lemma [9.32| is the value of C -I- 1, where the constants k and C 
are as in the definition of admissible. Definition 13.41 

We then update $m \leftarrow m + 1$, and take the updated data $\mathcal{S}(m)$ to be the partition-system from Lemma 9.32. Observe that from (9.35) we have the estimates:

(9.48) multi(P,(m) | P,(m - 1)) < ^2-(wt, ^tj, Pi(m - 1), P2(m - 1)) . 

IF m is odd, by the previous step, the conclusions of Lemma 19.321 are in force. The 
observation to make is that we have this condition. For the event B defined below, we 
have P(B) < Jxt. 

B = {te Prim) \ S3{t) satisfies ^M and 

(|3.7|) in the definition of UT-admissible.} 
Recall that $\mathcal{S}_3(t)$ is given in (9.31). That is, with very high probability, if the trivial partition-system $\mathcal{S}_3(t)$ fails $u_T$-admissibility, it must be the condition (3.5) that fails.

Let us see that this observation is true. The conditions (|9.37[) and (|9.38|) applied to S{m) 
hold. Thus, except on a set of probability at most ^Tj, we have, using the notation of 
(19311) , 



\\rt:j,k - ^{rt:j,k I St:j X St:k)\\ui,'^,,.^s,, ^ K(|TTWT)^[PT(<S(m))] ^ ^ 

Therefore, if the trivial partition-system $\mathcal{S}_3(t)$ fails either (3.6) or (3.7) in the definition of $u_T$-admissibility, it must follow that $t$ has very small probability in its affine cell. Namely, we must have

(9.50) P(^ I H,i X Ht:2 X H,3) < |PT(<S(m))-iTT . 

But certainly, by the definition of Pr(<S(m) in (|9.30|) , we have 

2]^ P(M H X H X H) < ixr . 



f : t satisfies 19301 

This means that P(B) < kr for B as in (l9^ . 



IF there is an 1 < ^ < 4 for which we have 

W{F{\HxHxH)> ixr, 
f ^ := {f G Prim) - B \ Ssit) does not satisfy ^^ for this value of ^} . 

For such a choice of {, update pe <^ p^ + 1, and set o{£, pe) <— m. For each f G Fg, we can 
apply Lemma 18.31 Write 

te = Sue Yl ^^--J'^ ■ 

l<i<k<i 

Apply Lemma 18.31 with V = ti, U = t, and t = kUj. Since ^ ^ B, it follows that V = t^ 
satisfies the hypothesis of that Lemma, namely that V = tgis (4, d, ^)-uniform, with d as in 
(|831) . 
Then, from the conclusion of Lemma 8.3 we read this. There are partitions $\mathcal{P}(s_{t:j}, t_\ell)$,
1 < y < 4, of Stj into two sets, and partitions 

of ry J; into two sets, so that the there is an atom V in the partition 



P(Sf:^,i^)A l\ P{rtj,k,te) 

l<;<fc<4 

which has a higher correlation with t/;. Namely, 



W{V' I > c[KM^P(f I t()\ , 
P(^ I V) > W{t I tc) + c\KU^W{t I Uf\ 



p 

V 



Let 

P{U)= l\ P{r,,,tc). 

l<;'<fc<4 

It follows that we have 

(9.51) E[E(r n tc I P{tr)f > F{T \ tcf + w^P(T | kf . 

We update 

P;(m + 1) ^ P,(m), ii^i, 



P{Rj,k, m) A j\ P(Ry,^, U), 1 < ;■ < fc < 4 , ;, fc ^ ^ . 

It is these last two steps that create a second tower. Observe that we have, using the notation of (9.28) and (9.29),


(9.52) Pu(<S(m)) < P„(<S(m - 1))2 T [2P2(<S(m - l))*^] u = 1, 2 . 



It follows from (|9.51|) that we have 
(9.53) E [E(T I PT,{m))f > E [E(T | Pr,(m - 1))]^ + ttU^¥{T \ Tcf . 
The recursion then loops.



Once the recursion has stopped, it follows from the construction, in particular (9.53), and Proposition 9.5, that we must have



(9.54) 

The sum 2 X!,^=i Pe bounds the running time. 



pA < t/u/^ , 



At the end of the recursion, the conclusion (|9.47|) holds. The other conclusions are 
appropriate upper bounds on the multiplicities in terms of some (very quickly growing) 
function of Ut, tt, and the multiplicities of the given partitions. These estimates follow 
from dUi), and (l^32b . 



To supply some details, let us set 



r(l) := WsiuT, \ij, Pi (5), P2.S) X [2 T [2P^]] , 

r(p + 1) := ^^{uj, iTr,r(p),r(p)) x [2 1 [2r(p)^]] 



From (19:281) . (19:291) , ¥M^ . (l932l) , and (l934b , we have 

mult(P/(m) I P(m)) < r(m) < T{^%-^u^^) ,i = 1,2. 

Since W3 is itself a power-tower, defined in terms of the 2 tt / function, we thus, have 
a second power-tower from this estimate. Since the partition Pj is generated from the 
prior partitions, this last estimate proves (|9.46|) . The estimate (|9.45|) follows from similar 
considerations, and the estimate (|9.34|) . n 



9.7 Proof of Lemma 3.17



Recall that $A \subset T$, by assumption, and that $\mathbb{P}(A \mid T) \ge \delta + v$. Apply Lemma 9.44 to the corner-system $\mathcal{A}$ as in (3.2). This Lemma also takes the parameters



— r^f■^ 



m^b, Tt = cv^^'PiT \ H X H X H) . 



Here the constant $C_T$ is the constant that appears in Lemma 8.3; see (8.9). Let $\mathcal{S}'$ be the partition-system given to us by this Lemma, satisfying (9.46) and (9.47).

Also consider the set

E' := [tePjl F{t \ Ht,i x Ht,2 x Ht,3) < z;[Pr(<S')]-^P(T | H x H x H)) 

Here, we are using the notation of (|93T]| and (l930ll . Then, it is clear that P(U{^ U e E'}) < 
Tt". Hence, by the pigeonhole principle (See Proposition 19.71 ) we can select f G P^ so that 
1 1 E', and the T'-system ^3(0 is 6-admissible, which is (l3^20l) and P(A | T) > 6 + v/4 which 



is (l3J9b . The estimate (I3l8b follows from the estimate (l9^45l) . 



10 The Algorithm to Conclude the Main Theorem 

This is a well-known argument. To prove our Main Theorem, we should show that for any $0 < \delta < 1$ there is an $n(\delta)$ so that if $\dim(H) \ge n(\delta)$, and $A \subset H \times H \times H$ with $\mathbb{P}(A \mid H \times H \times H) \ge \delta$, then $A$ contains a corner.

We recursively construct a sequence of corner-systems

$\mathcal{A}(m) = \{H,\ S_i(m),\ R_{i,j}(m),\ T(m),\ A(m) \mid 1 \le i, j \le 4\}$.

$\mathcal{A}(0)$ is the 'trivial' corner-system

$S_i(0) = H$,  $R_{i,j}(0) = H \times H$,  $T(0) = H \times H \times H$,  $A(0) = A$.

Moreover, at each stage, $A(m) \subset A$, so that a corner in $A(m)$ is a corner in $A$.

The point is that the recursion, when it stops, provides us with a corner-system $\mathcal{A}(m_0)$ so that (1) $\mathbb{P}(A(m_0) \mid T(m_0)) \ge \delta$, (2) $\mathcal{A}(m_0)$ is $\mathbb{P}(A(m_0) \mid T(m_0))$-admissible, (3) $\mathcal{A}(m_0)$ satisfies

(10.1) dim(H(mo)) > dim(H) - Odi^(6) , 

(10.2) P(r(mo) I H(mo) x H(mo) x H(mo)) > Oa,f(6) . 

Here, Odim is a map from [0, ] to N, and M^a,p(<5) is a finite function from [0, 1] to itself. 
Then, it follows that Lemma [3.13| implies A(mo) has a corner provided (|3.14[) holds, that is 

\H{mo)f > WOW A A^f- 
By (10.1), this will clearly hold provided $\dim(H) \ge n(\delta)$, for a computable function $n(\delta)$. Thus, our Main Theorem is proved.

The recursion is this: Given the corner-system $\mathcal{A}(m)$, it will be $\mathbb{P}(A(m) \mid T(m))$-admissible. If it does not satisfy (3.15), then we apply Lemma 3.16 to conclude the existence of a corner-system

^'{m) = {H'{m), S'.{m), R[j{m) , T{m),A'{m) \l<i,i< 4} 

satisfying these conditions: A'{m) c A{m), 

P(r(m) I T{m)) > K[P(A(m) | ^(m))]l/^ 
P(A'(m) I V{m)) > P(A(m) | T{m)) + K[P(A(m) | T{m))]^''' . 

These are the conclusions of Lemma 3.16.

The corner-system $\mathcal{A}'(m)$ need not be $\mathbb{P}(A'(m) \mid T'(m))$-admissible; therefore, we apply Lemma 3.17 with

b = P(A(m) I T{m)) , v = K[P(A(m) | T(m))]^/'^' . 

The conclusion of this Lemma gives us a new corner-system $\mathcal{A}(m + 1)$, which satisfies

P(A(m + 1) I Tim + 1)) > P(A(m) | T{m)) + K[P(A(m) | T{m))]^''' 

^^°-^^ > 5 + Kb"^ 

¥(T(m + 1) I H(m + 1) x H(m + 1) x H(m + 1)))) 
(10.4) 

> ^r(F(A(m) | r(m)), F{T{m) \ H{m) x H{m) x H{m))) , 

codim(H(m + 1)) < ^codim(P(A(m) | T(m)), P(T(m) | H(m) x H{m) x H{m))) . 



The functions M^codim and M^j are derived from those in (|3.18|) and {3.11) by a change of 
variables. 

Note that (|10.3|) implies that the recursion can continue for at most Mq < 4(k6^^''')"^ times 
before it must stop, as the density of $A(m)$ in $T(m)$ can never be more than 1. Note that initially we have $T(0) = H(0) \times H(0) \times H(0)$, so the iteration of the estimate (10.4) can be phrased completely in terms of a fixed function of $\delta = \mathbb{P}(A(0))$; therefore the estimate (10.2) holds. A similar argument applies to prove the estimate (10.1), completing the proof of our Main Theorem.
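The counting behind the stopping claim is elementary: if every pass of the recursion raises the density by at least $\kappa$ times a fixed power of the current density, and the density starts at $\delta$ and can never exceed 1, then the number of passes is at most $(1 - \delta)/(\kappa \delta^{C})$. The following is a minimal numerical sketch of that count. The exponent in (10.3) is not legible in this copy, so `C` below is a hypothetical stand-in, as are the sample values of `delta` and `kappa`.

```python
def count_increments(delta, kappa, C):
    """Iterate density -> density + kappa * density**C while it stays at most 1.

    Returns (number of increments performed, final density).
    C is a hypothetical exponent standing in for the one in (10.3).
    """
    steps = 0
    density = delta
    while density + kappa * density**C <= 1.0:
        density += kappa * density**C
        steps += 1
    return steps, density


# Assumed sample parameters, for illustration only.
delta, kappa, C = 0.1, 0.25, 2
steps, final_density = count_increments(delta, kappa, C)
# Each increment is at least kappa * delta**C, which gives this crude bound.
bound = (1 - delta) / (kappa * delta**C)
```

The crude bound only uses the size of the first increment; since later increments are larger, the true number of passes is typically far below it.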